Foreword



In teaching reading to young people in the elementary and secondary 
schools, one of my objectives was to help students adapt their rate and 
style of reading to their purpose and to the material. In working with 
both inservice and preservice classes, I made certain that the
importance of this skill (reading flexibility, to use the term suggested
by the author of this publication) was frequently presented. As an
adult reader, and an observer of the reading habits of other adults, I
am aware that flexibility has all but vanished from our repertoire of
skills and that many of us are rather inflexible in our reading habits. 

In this title, Dr. Rankin presents the concept that flexibility involves the
reader's reading skills, his psychological state, and the difficulty of the 
material he reads. Certainly, such a concept is consistent with the view 
that the act of reading is basically an aspect of the thinking process. 
Dr. Rankin reinforces the importance of the teacher's understanding 
of the learning processes and his realization that effective teaching of 
reading is premised upon knowledge of the learning style of each 
pupil. 

The officers and members of the Board of Directors of the 
Association express their appreciation to Dr. Rankin for this contribu- 
tion to the Association and to more effective learning to read by 
young people and adults throughout the world. 



Millard H. Black, President 
International Reading Association 



1973-1974 




Preface 



The writer wishes to acknowledge the contribution of two papers in
helping him locate significant references and in suggesting valuable
insights related to measurement problems. One paper is an unpublished
manuscript entitled "Reading Flexibility: An Investigation" by
Kathleen A. Jongsma (1971); and the other is "Assessment of Flexible
Efficient Reading" by Phil L. Nacke (1971).

This monograph is an outgrowth of an earlier paper entitled "The 
Measurement of Reading Flexibility" printed in the Occasional Papers
in Reading series under the auspices of the Reading Program of the
Indiana University School of Education. It is, essentially, a revision of
the original paper with substantive modifications and editing changes.
Introduction



This study is concerned with the measurement of reading flexibility.
As such, it emphasizes primarily different measurement procedures
and does not attempt to present a comprehensive investigation of the
results of such procedures which have been utilized many times in the
study of reading flexibility and its correlates. Since various concepts
of reading flexibility influence techniques of measurement, it is
necessary first to consider a few representative concepts as they are 
revealed in verbal definitions and a summary of variables indicated in 
these definitions, only some of which have been used in research on 
the measurement of different types of reading flexibility. Second, for
reasons to be given, certain kinds of measurements or studies are 
excluded from this review. Third, published tests which purport to 
measure reading flexibility are described and evaluated, and then a
variety of informal tests and measurement procedures found in the 
research literature, which might have implications for the construction 
of more adequate tests of reading flexibility, are reviewed. 

Following the review of the literature on concepts of reading 
flexibility and techniques for their measurement, a summary of areas 
of agreement and positive findings is presented. This is followed by a 
critical evaluation of instruments and techniques of measurement 
which have been used in previous research. An attempt is made to 
point out both the strengths and weaknesses which characterize 
efforts to measure this important aspect of reading. 

The concluding section of this study includes a proposed model for
reading flexibility which suggests needed research and development on 
the measurement process. Finally, other recommendations are pre- 
sented for research and development of more valid and useful 
measurements of reading flexibility in the years ahead. 
Concepts of Reading Flexibility

Many different concepts of reading flexibility have been found in
research on this topic. Differences in concepts are reflected in verbal
definitions by "authorities" using a number of different variables
which have influenced the construction of measurement instruments.
Several definitions are now presented, and these are followed by a 
summary of variables revealed in these definitions. Finally, some types 
of excluded measurement procedures and studies are specified which do 
not include measurements conforming to most concepts of reading 
flexibility. 

Definitions

The following quotations are probably representative of different 
points of view or degrees of emphasis about components of reading 
flexibility. Perhaps one of the earliest definitions was presented by
Carrillo and Sheldon (1952): 

The mature reader is the adaptable, versatile reader; he should be able to adapt 
his rate of reading to the purpose with which he approaches the printed page, 
and to the difficulty level of the material. The goal is understanding at an 
adequate level. 

A different emphasis was observed in this definition by Berg (1967): 

In general, the term refers to the activity a reader is engaged in when he sets up 
various patterns of thinking relative to his reading needs and then selects the 
skills that best accomplish this purpose. The term also implies that the 
reader can carry out the reading activity selected with an optimum of 
comprehension for the time expended. 

A very broad concept was indicated in Stauffer's definition of
flexibility as ". . . a high rate of efficiency in satisfactory attainment
of the reader's purpose" (1962). McDonald (1963, 1965, 1967) in his
many writings on this subject has consistently rejected the notion that
flexibility is the adjustment of rate best suited to purpose and reading
material. He has emphasized the adjustment of reading approaches
(i.e., perceptual and cognitive processes, reading skills, study
techniques) as being necessary to gain an understanding of the author's
meaning as dictated by the reader's purpose. He also has included the
concept of a minimum expenditure of psychological and physiological
effort within his concept of reading flexibility.

McCracken (1965) has made a distinction between internal flexibility
(the adjustment of rate and approach within the sentences and
paragraphs which make up an article) and external flexibility (similar
adjustments between total passages). This is an important distinction
because very little study has been given to the investigation of internal
flexibility.



Summary of Variables

The previous definitions are in agreement that reading flexibility
involves a relationship between one or more dependent variables
involving some changes in reading behaviors and some one or more
independent variables involving differences within the reader or within
and/or between materials. Mere changes in reader behavior from one
point in time to another are not indications of reading flexibility.

Several different variables have been indicated in these varying
concepts. Independent variables include reader purpose and difficulty
of material. Dependent variables include reading rate, reading
approaches, minimum effort, maximum efficiency, purpose attainment,
and optimum comprehension.



Excluded Studies 

It is evident that there is no such entity as reading flexibility. Rather, 
there are different types of reading flexibilities. The following review
of techniques of measurement reinforces this conclusion. However,
there are types of studies sometimes considered as reading flexibility
investigations, which do not involve the measurement of reading
flexibility as indicated by most definitions. One of these is an
investigation which manipulates differences between individuals as
independent or dependent variables. For example, if two equated
groups read two passages written at different levels of difficulty and
displayed significant mean differences in rate, this would not be a
measurement of reading flexibility. Such a study would suggest
hypotheses which could be included in the measurement of reading
flexibility, but all definitions agree that intraindividual differences in
behavior as a function of some one or more independent variables is
an essential ingredient in the concept of reading flexibility. These
types of interindividual studies will be excluded from this paper
except insofar as they suggest important ideas for the measurement of
reading flexibility.

Another type of study to be excluded from this survey is the
measurement of the relationship between rate and comprehension.
The writer considers both rate and comprehension as dependent
variables which result from changes in reading approaches as a
function of the manipulation of some one or more independent
variables. Therefore, such studies do not measure reading flexibility as
the term is used by this writer.

As previously stated, studies of mere changes in reading behavior
over time, unrelated to some independent variable, are not included
within the concept of reading flexibility.

Finally, reading flexibility is essentially a positive concept. It is a
desired outcome of learning. It does not include just any change in
reader behavior as a function of a change in an independent variable.
Therefore, the study of the relationship between interest-appeal of
materials and reading rate which might indicate a correlation (either
positive or negative) between the two variables does not measure
reading flexibility. Certainly, the measurement of the relationship
between material difficulty and comprehension, which shows that
comprehension declines as material difficulty increases, would give no
information about reading flexibility. The measurement of reading
flexibility necessarily entails a study of the ability of readers to adjust
their behavior under two or more conditions so as to accomplish their
reading purpose(s). Such changes in behavior are reflections of
desirable changes in reading approaches. A further discussion of this
conceptualization of reading flexibility is presented in the chapter on
the "Model of Reading Flexibility."
Techniques of Measuring Reading Flexibility

Various published tests constitute operational definitions of reading
flexibility which have influenced the outcome of many research
studies. There are only a few published tests of this skill, and these
tests are, in the writer's opinion, rather primitive from the standpoint
of adequate standards of measurement. These are now described and 
evaluated. Following the review of published tests, a number of 
unpublished tests and measurement procedures used in research on 
reading flexibility are considered. 

Published Tests

Test of Reading Flexibility. This test was devised by Spache and Berg
in 1958 for use by college students and adults (Spache, 1956). It was 
published in a book entitled Faster Reading for Business, now out of 
print. The writer believes this was the first published test of reading 
flexibility. The test attempts to measure flexibility by studying the 
effects of variations in reading the same article three times for three 
different purposes. The reading passage is a 2,800 word article about
forecasting for business. On the first reading (skimming), the reader is 
given a three minute time limit to read the article and told that he will 
be asked to answer ten questions on main points without looking back 
at the article. The second reading (scanning) requires the reader to 
read ten questions in advance and find the answers by referring to the 
selection. The timing on this reading includes the time it takes to 
answer all questions. The third reading (reading for a thorough 
understanding) is read without time limits with instructions to read in
order to answer twenty detailed questions without looking back at the
selection. Timing is based on reading time only. The questions cover 
facts, inferences, and conclusions. 

The Test of Reading Flexibility attempts to measure the relationship
between purpose as an independent variable and rate measured as
time spent. Comprehension is measured as attainment of the assigned
purpose and must be adequate, as indicated by norms, in order for
rate measurement to be interpretable. A very desirable feature of this
test is that material and reader variables such as difficulty, background
information, and interest appeal are held constant while purpose is
(presumably) systematically varied by specific and clearly stated
reading objectives. Of course, the extent to which individual readers
accept or reject the assigned purposes is not known. The reading
passage is longer than passages used in many other tests of reading
flexibility. Unlike a subsequently developed test battery which
attempts to measure the results of skimming with only three questions
and the results of scanning with only one question (i.e., Reading
Versatility Tests), this test provides ten questions to measure
comprehension on both skills. Twenty questions are used to measure
thorough comprehension, and a test of this length might have suitable
reliability. All test questions are multiple-choice with four alternatives.

Separate norms are given on a five point scale ranging from poor to
excellent for both rate and comprehension. The book provides an
extensive discussion for the interpretation of test results. No attempt
was made by the authors to establish criteria for the interpretation of
differences in rate, as such. As will be noted later, this is a distinct
advantage. Instead, the reader can interpret his rate on scanning and
thorough reading individually on a normative basis, provided that his
comprehension was satisfactory according to the norms.

Unfortunately, no information was provided by the authors
regarding the size or characteristics of the normative sample. Judging
from Spache (1956), the normative data were gathered informally by
loaning test copies for trial use in exchange for accumulation of test
results for various groups. No information was given in the book about
the reliability or validity of either rate or comprehension scores. In
fairness to the authors, it must be admitted that this book was hardly
an appropriate place for technical information. However, the writer is
informed by Dr. Berg that the test was constructed informally without
obtaining this kind of technical data. It should be noted that the
subject matter of the reading passage is more appropriate for adults in
business than for many college students.

The Test of Reading Flexibility, although perhaps the first
published test of reading flexibility and now out of print, has many
desirable features which might well be emulated by future
constructors of flexibility tests. Due to the excellent control over many
variables, other than purpose, which might influence changes in rate
and comprehension, the results of this test are more easily
interpretable than the results of several tests which have been published
subsequently.
Flexibility of Reading Test. Constructed by Braam and Sheldon,
two of the three forms of this test were published in a book for
college students entitled Developing Efficient Reading (1959). The
third form has not been published. The test contains five passages
representing five different types of material: narrative, literature,
science, history, and psychology. The articles range in length from
750-900 words. An attempt is made to hold purpose constant for all
selections by instructing the reader to read as quickly as he can and
still understand the general content of the selection. The reader is
timed on each passage to obtain a rate measured in words per minute.
No time limits are used for reading the materials. After reading each
article, the reader attempts to answer ten true or false questions.
Strangely, the reader is not told whether to refer to the selections in
answering questions. Presumably, he is expected to answer these
questions without referring to the text. The degree of flexibility is
measured by the amount of difference between the slowest and the
fastest rates on the five passages for a given student. Comprehension
scores are expressed as percentages.
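
The scoring rule just described can be illustrated with a minimal sketch. The function names, word counts, and reading times below are hypothetical and are not taken from the published test; the sketch assumes only that rate is words read divided by minutes spent and that flexibility is the fastest rate minus the slowest rate for one student.

```python
def words_per_minute(word_count, seconds):
    """Reading rate in words per minute for a single passage."""
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    return word_count * 60.0 / seconds


def range_flexibility(rates):
    """Difference between the fastest and the slowest rate (wpm)."""
    rates = list(rates)
    return max(rates) - min(rates)


# Hypothetical student: passage type -> (words in passage, seconds spent).
passages = {
    "narrative": (900, 180),
    "literature": (800, 200),
    "science": (750, 225),
    "history": (850, 204),
    "psychology": (780, 234),
}

# One rate per passage, then one difference score for the student.
rates = {name: words_per_minute(w, s) for name, (w, s) in passages.items()}
flexibility = range_flexibility(rates.values())
```

Note that the single number produced this way hides which passages produced the extremes and says nothing about comprehension, which is part of the interpretive problem discussed next.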

As Braam (1963) has noted, there were several problems of control
in the construction of this test. He rightly pointed out that purpose
for reading could not really be controlled by test instructions. He
noted that the Dale-Chall readability formula measurements for each
passage showed the passage difficulties to range from grades nine-ten
through grade sixteen. The directions for the test inform the student
that he is to read passages covering different subjects which are of
varying levels of difficulty and familiarity.

In the light of these confounded independent variables, it is
difficult to interpret the meaning of the test results. Differences in
rate and/or comprehension may be due to any one or more of these
variables for a particular student. A table was provided for interpreting
flexibility (i.e., difference in rate between the slowest and fastest
passage) on a seven point scale ranging from very poor to outstanding.
However, the basis for these normative categories was not explained in
the book. More importantly, the use of difference scores to measure
flexibility in this and other tests raises two very fundamental
questions.

First, as Thorndike and Hagen (1961) have shown, when a
difference is taken between two test scores, the reliability of the
difference is usually much lower than the reliability of the two tests
upon which the difference was based. This reduction in reliability is
due to the fact that, in the subtraction process, whatever factors are
common to both measures are cancelled out, and the difference score
contains only those factors which are specific to each test plus the
errors of measurement of both tests put together. It can be shown that
this reduction in reliability of the difference score increases as the
correlation between the two individual measurements increases.
Theoretically, if the correlation between the two tests were perfect, the
difference scores would reflect only errors of measurement; that is,
the reliability would be zero (Thorndike and Hagen, 1961). Since
reading rate scores tend to be highly correlated despite differences in
difficulty and nature of materials (Blommers and Lindquist, 1944;
Carlson, 1951), it is unlikely that difference scores in reading rate will
have satisfactory reliability for the measurement of reading flexibility,
even if the reliabilities of each individual rate test are substantial.
Hence, on statistical grounds alone, it is highly improbable that the
flexibility measurements in this test have suitable reliability and,
therefore, validity.

Second, even if the difference scores on this or
other tests had satisfactory reliability, the question still remains as to a
possible interaction effect between difference scores and status on the
lowest rate measurement involved in the difference, due to the
measurement procedure itself. This procedure involves taking the
difference between the slowest and fastest rates. Unless it can be
shown that there is no such test-induced interaction, a person whose
lowest rate score is very fast might not have the same chance to make
a high flexibility score as a person whose lowest rate score is rather
slow. If this were the case, difference scores established at different
levels of speed would not be comparable measurements.
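
The first point can be made concrete with a short sketch. It uses the standard classical test theory expression for the reliability of a difference between two scores with equal variances, r_dd = ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy). This formula is supplied here as an illustration and is not quoted from Thorndike and Hagen; the reliabilities and correlations are hypothetical values, not results from any of the tests reviewed.

```python
def difference_score_reliability(r_xx, r_yy, r_xy):
    """Reliability of the difference X - Y under classical test theory.

    Assumes the two scores have equal variances.
    r_xx, r_yy: reliabilities of the two individual measures.
    r_xy: correlation between the two measures.
    """
    if r_xy >= 1.0:
        raise ValueError("formula is undefined when r_xy is 1.0 or more")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


# Two hypothetical rate measures, each with a reliability of .85.
moderately_correlated = difference_score_reliability(0.85, 0.85, 0.70)
highly_correlated = difference_score_reliability(0.85, 0.85, 0.80)
# When the correlation equals the reliabilities, nothing but error remains.
only_error_left = difference_score_reliability(0.85, 0.85, 0.85)
```

Under these assumed values, two individually respectable rate measures yield a difference score whose reliability falls from .50 to .25 as their intercorrelation rises from .70 to .80, and to zero when the intercorrelation reaches the level of the reliabilities themselves.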

The interpretation of comprehension scores in this test is not clear.
Obviously, ten true or false questions are not a reliable measurement
of comprehension. No norms are provided for these tests. The reader
is cautioned that a rapid rate (?) accompanied by a comprehension
score of below 80 percent and a feeling of lack of understanding
". . . would, of course, not be a valid basis for comment" (Braam and
Sheldon, 1959). He is also told that a comprehension score of 70
percent or below indicates a need to read more slowly. Apparently
these percentage criteria were arrived at in a completely arbitrary
manner. Their chief function is probably to keep the rate scores
"honest." They do not constitute an adequately measured dependent
variable of comprehension or purpose attainment in the test of
flexibility.

According to Braam (1963), comparability of test forms was
accomplished by taking two selections for each of the five passages
from a common source. No evidence was given for the effectiveness of
this procedure. Also, no evidence was given regarding comparability of
test questions either within or between forms. Order effects in reading
a sequence of passages were apparently not considered. The authors
provided no evidence of reliability or validity of measurements for
rate, comprehension, or flexibility.

However, Berger (1969) studied the interform reliability of the rate
scores and modified multiple-choice comprehension questions for the
three forms of the test. The true or false comprehension items were
apparently modified to reduce the effects of guessing on these scores.
The rate reliability coefficients (based on seventy college students)
between forms for the same subject areas ranged from .56 to .83.
These reliability figures were quite low for individual or group use.
Often, the interform rate correlations were higher between subject
than within subject areas. The reliability coefficients for comprehension
items within subject areas ranged from -.!^5 to M). These figures
were, of course, quite low. Many correlations did not even attain
statistical significance at the .05 level. Again, many of the between-area
correlations were higher than the within-area correlations. If
Berger's multiple-choice questions were an improvement over the true
or false questions in the published test, the reliability (and, hence, the
validity) of the published comprehension tests must be low indeed.
Berger (1969) found high rate reliabilities (but low comprehension
reliabilities) for the test as a whole. However, the relevance of these
findings is not clear. The essential reliability of a test of flexibility
depends upon the reliability of rate and comprehension scores for the
subtests, not the total test.

It must be concluded that the Flexibility of Reading Test is a highly
defective instrument lacking in most of the accepted technical
prerequisites for good test construction. It might be assumed that this
test, like the previously described test by Spache and Berg, was
published as part of a workbook and, therefore, is not subject to
rigorous conventional technical criteria as applied to standardized
tests. However, the test was used by Braam as a measuring instrument
in a scientific study. An attempt by the writer to obtain technical data
about the test from the authors proved unsuccessful.
Reading Versatility Tests. These tests were first published in 1962
by McDonald et al. and were revised in 1968. This was the first
published battery of tests of reading flexibility to be used by readers
of various grade levels and to provide four equated forms at each
reading level. Whether the term reading level is to be interpreted as
reading achievement level or as grade level is not clear in the Manual of
Directions (McDonald, 1968). The three levels are Basic (levels 5-8),
Intermediate (levels 8-12), and Advanced (levels 12-college).

Each test contains four selections, each differing in difficulty, style, 
and theme. Each of these four passages is supposed to be read for a 
different assigned reading purpose which requires different
approaches. McDonald speaks of such variables as skills, ways of
thinking, and psychological set as approaches. According to the
manual, the first selection is a fiction passage which is to be read
rapidly but with attention directed to important facts, main ideas, and
details. The second selection is a nonfiction passage which is to be
read carefully with attention to details, main ideas, and implications.
The reader is supposed to skim the third passage looking for main
ideas. The fourth passage is to be scanned in order to locate the
answer to one question provided in advance. All questions, with the 
exception of the scanning task, are answered without looking back at 
the passage. There are no time limits. Reading time before answering 
questions is converted to a words-per-minute score. Ratios of reading 
rates for various selections are used to indicate efficiency in varying
reading approaches for different purposes (i.e., flexibility). Selections 
one and two are followed by ten questions, the third selection (to be 
skimmed) is followed by three questions, and the fourth selection (to 
be scanned) requires the student to answer only one question. 
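
A minimal sketch of a rate ratio of this general kind follows. Which pairs of selections the manual actually compares, and what criteria it applies to the result, are not reproduced here; the pairing, the function name, and the rates below are hypothetical.

```python
def rate_ratio(rate_numerator, rate_denominator):
    """Ratio of two reading rates, both in words per minute."""
    if rate_denominator <= 0:
        raise ValueError("denominator rate must be positive")
    return rate_numerator / rate_denominator


# Hypothetical words-per-minute scores for one student on four selections.
rates = {
    "rapid fiction": 300.0,
    "careful nonfiction": 200.0,
    "skimming": 900.0,
    "scanning": 1500.0,
}

# Assumed pairings, for illustration only.
skim_to_careful = rate_ratio(rates["skimming"], rates["careful nonfiction"])
rapid_to_careful = rate_ratio(rates["rapid fiction"], rates["careful nonfiction"])
```

A ratio of this sort is only as trustworthy as the two rates that enter it, a point taken up in the evaluation that follows.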

At first glance, the Reading Versatility Tests appear to be an 
impressive and ambitious attempt to provide a new and better 
measurement of reading flexibility for readers of many ages. The battery
has all of the advantages of multiple forms and publication of individual
tests free from the confines of a book. This was not the case with the two
previously described tests. However, this series of tests contains some
serious flaws.

No norms are provided in the manual for rate measurements on
these test materials. Instead, the reader is referred to the Educational
Developmental Reading Laboratory norms established for the Reading
Eye Camera. These norms were established on different materials of
short duration with statements of purpose different from those used
in the Reading Versatility Tests. Furthermore, these norms were
limited to readers at each grade level who, while attached to a camera,
attained at least a 70 percent score on comprehension tests.
Consequently, these results are probably not representative of readers
in general. Some impressive interform reliability coefficients for rates
are given in the manual for selections one and two ranging from .82 to
.90. Detailed reliability coefficients are, for some reason, not given for
all forms of the test at each level. These reliability coefficients in the
manual may, or may not, be representative of reliability coefficients
for all test forms of the Reading Versatility Tests. The rate reliability
measurements for selections three and four combined are quite low
with correlations ranging from .51 to .70. Had these two parts not
been combined, the rate reliabilities would undoubtedly have been
lower. It is, of course, the reliabilities of individual parts of the total
test which are particularly relevant to the measurement of reading
flexibility.

Differences between rates on various subtests are converted to rate
ratios by dividing the rate obtained on a particular passage read for
one purpose by the rate obtained on another passage which was read
for another purpose. Criteria were provided in the manual for the
interpretation of the efficiency of these ratios, but no basis was given
for the manner in which these criteria were determined. It is
significant that no reliability figures at all were given for these rate
ratios in the manual. If other investigators have obtained this
information, the writer has been unable to locate such studies. It may
be that McDonald et al. simply assumed that if the individual rate
measures were sufficiently reliable, consequently it could be logically
deduced that the ratios based on these rate measurements would be
reliable also. Unfortunately, this is not necessarily true. Rate ratios
based upon differences between rates may have the same deficiencies
in reliability as difference scores, themselves. This problem is the same
as described in the section on the Flexibility of Reading Test by
Braam and Sheldon. Even with adequate individual rate reliabilities
(which all four subtests in each total test of this battery do not have)
the reliability of a difference score on correlated measurements must
be low. The following quotation by Gulliksen (1950) points out the
similarities between reliability deficiencies of difference scores and
ratio scores which are simply another way of expressing differences:

When the accomplishment quotient (AQ) was introduced, Kelly pointed out 
that the problem involved was to obtain reliable measures of each variable. 
Clearly these measures would be correlated so that the accomplishment
quotient might reflect only errors of measurement.

Now if the criteria which McDonald et al. used for efficient rate
ratios were based on minimal rate differences sufficiently large to
exceed the standard error of the measurement for particular
differences in scores involved in computing rate ratios, this might be an
improvement in test construction. However, since they used the same
criteria for interpreting rate ratios on all test forms at all levels, it is
unlikely that this technique was used. The use of the same criteria for
interpreting the efficiency of different rate ratios based on different
levels of the distribution of different tests is questionable. As
Cronbach (1941) pointed out, ". . . ratio scores within the same
population and even ratios which are equal in size, may not have the
same reliability. The standard error of the measurement increases as
the denominator decreases."
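
The direction of the effect Cronbach describes can be sketched with a first-order (delta method) approximation to the standard error of a ratio. This approximation is not Cronbach's own derivation; it assumes that the errors in the two rates are uncorrelated and small relative to the rates, and the rates and standard errors below are hypothetical.

```python
import math


def ratio_standard_error(numerator, denominator, se_num, se_den):
    """Approximate standard error of numerator / denominator.

    First-order approximation assuming uncorrelated measurement
    errors that are small relative to the two positive scores.
    """
    if numerator <= 0 or denominator <= 0:
        raise ValueError("both scores must be positive")
    ratio = numerator / denominator
    relative_error = math.sqrt(
        (se_num / numerator) ** 2 + (se_den / denominator) ** 2
    )
    return ratio * relative_error


# Same numerator rate and same error of 15 wpm on every score;
# only the denominator rate differs.
se_large_den = ratio_standard_error(600.0, 300.0, 15.0, 15.0)
se_small_den = ratio_standard_error(600.0, 150.0, 15.0, 15.0)
```

With identical measurement error on each rate, the ratio formed on the slower denominator rate carries the larger standard error under these assumptions, so a single set of interpretive criteria cannot be equally dependable at all levels of speed.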

In addition to the problem of the reliability of these rate ratios, like
the difference scores there may be a test induced statistical interaction
between the status of the base rate for the difference and the ratio
obtained. Until this issue is resolved, it is not meaningful to talk about
slow readers with a high rate ratio versus fast readers with a low rate
ratio, etc. A desirable feature of the test was an attempt to reduce the
effect of previous information on comprehension test performance for
selections one and two by using reading related items. Questions
which could be answered correctly by 40 percent or more of the
students who had not read a given test passage were discarded or
rewritten. This procedure was apparently successful, because, despite
the short length of the comprehension tests for selections one and
two, concurrent validity coefficients between these subtests obtained
for several groups and the Diagnostic Reading Test: Survey Section;
the Level of Comprehension subtest of the Cooperative Reading Tests:
Reading Comprehension; and the Davis Reading Test were quite
substantial with correlations ranging from .67 to .89.
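
The item screening rule described above can be sketched as follows.
This is only an illustration of the 40 percent criterion. The function
name, the item labels, and the proportions are hypothetical and are not
data from the test manual.

```python
def screen_items(proportion_correct_without_reading, cutoff=0.40):
    """Split items into those kept and those to discard or rewrite.

    An item is flagged when the proportion of students who answered it
    correctly WITHOUT reading the passage is at or above the cutoff.
    """
    keep, discard_or_rewrite = [], []
    for item, proportion in proportion_correct_without_reading.items():
        if proportion >= cutoff:
            discard_or_rewrite.append(item)
        else:
            keep.append(item)
    return keep, discard_or_rewrite


# Hypothetical tryout results from students who had not read the passage.
tryout = {"item 1": 0.22, "item 2": 0.40, "item 3": 0.55, "item 4": 0.31}
kept, flagged = screen_items(tryout)
print("kept:", kept)
print("discard or rewrite:", flagged)
```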

Despite the impressive concurrent validity of the comprehension
tests for selections one and two, their interpretation for an individual
student is subject to question because no norms were provided.
Instead, a 60 percent criterion was recommended as a minimum score
on both selections one and two if the rate scores were to be
considered meaningful. If it is assumed that selections one and two
were each written at substantially different levels of readability, and if
the test for each passage was a valid measure for the comprehension of
that particular passage, how would it be possible to use the same 60
percent criterion for the comprehension of both passages?

There were also no norms or reliability coefficients provided for the
skimming and scanning comprehension tests in the manual. The
manual instructs the reader to disregard the rate scores for skimming if
the student missed more than one question (out of three) and on
scanning if he missed the one (and only) question on that test. No
evidence was given for the manner in which these criteria were
established. On such a small number of questions and by using these
apparently arbitrary criteria, how can the student be informed that
"he may be trying to use skimming and scanning techniques but lacks
the requisite skills" (McDonald et al.)? Also as Maxwell (1969) has
indicated, adequate reliability of comprehension on skimming and
scanning cannot be obtained with only a few items.

The manual claims that comparability of forms was obtained by
using articles in each form for each part which were comparable in
difficulty, content, interest appeal, and style. Curiously, however, no
evidence (other than a few selected interform reliability coefficients)
was provided in support of these claims.

All of McDonald's writings on reading flexibility have placed a great
deal of emphasis upon adjustment of reading approaches rather than
reading rates as fundamental to his concept of reading flexibility. In
the writer's opinion, however, it is precisely this factor of approach
which the Reading Versatility Tests do not really measure adequately.
Although space does not permit documentation of this point,
McDonald's statements of purpose are vaguely stated and might
induce a variety of approaches in different readers. Also, the
comprehension questions do not necessarily reveal the approach used.
As Nacke (1971) has indicated, there are different skimming strategies
which are not measured by answering test questions. There also may
be a variety of approaches to finding a main idea or remembering details
which are not revealed by answering test questions. Due to the many
confounded variables which influence performance on the Reading
Versatility Tests (difficulty, style, content, interest appeal, purpose),
different individuals might be induced to use a variety of approaches
on the same subtest, and this would not necessarily be reflected in
their answers to test questions. The manual does claim that evidence
from eye-movement photographs was consistent with the assumption
that differential approaches were being used. But little empirical
evidence was cited in the test manual (other than a reference to a
research paper by Welch, 1964) to support this contention. In any
case, the question still remains about the adequacy of eye-movement
photographs to reflect the many subtleties of cognitive and perceptual
functions involved in reading approaches. No introspective evidence
known to the writer has been gathered on these tests to obtain
evidence about the reader's acceptance of assigned purposes or his
perception of his approaches used to attain these purposes. Even if
comprehension test results were conclusive evidence of approach used,
the tests do not make provisions for analysis of test results on
selections one and two to determine purpose attainment such as main
ideas, details, and inferences. Rate, rather than approach, is the only
carefully measured dependent variable in this test, but even this
measurement is of doubtful validity expressed in ratio form. As the
writer has previously indicated, the causal variables which might
influence test performance are so confounded that test results are
difficult to interpret. McDonald et al. (1968) maintained that only 5
percent of high school and college students are flexible as measured by
his test. Would greater differences in readability between selections
one and two produce greater rate ratios? Do confounded influences of
style, background information, or interest appeal operate, perhaps, to
reduce the effects of differences in readability upon rate ratios? Would
different criteria for the interpretation of rate ratios based on
adequate empirical evidence show that perhaps people are more
flexible after all? More fundamentally, are rate ratios valid measures of
differences in approach used in reading for different purposes? Would
more reliable measures of comprehension in skimming and scanning,
together with more realistic comprehension criteria for eliminating
consideration of these rate scores for individual students because of
inadequate comprehension, reveal a higher degree of reading
flexibility?

These and many other questions remain unanswered. Given the 
many inadequacies of this test, results of studies which have been 
based on the use of the Reading Versatility Tests should be regarded
with caution. It is unfortunate that so many often quoted findings and 
conclusions about the nature and extent of reading flexibility have 
been based upon measures provided by the Reading Versatility Tests. 

Reading Test. Published in 1970, this test was constructed by
Raygor for use by college-bound high school juniors and seniors and
also college freshmen and sophomores. Two forms of the test have
been published. Reading flexibility is only one of several skills
measured by this test, and the comments in this paper are limited to
this part of the total test and the test for skimming and scanning.
Flexibility is measured by comparison of reading rates on two articles
written at different levels of difficulty and written about different
topics. The first article resembles easy recreational material and is,
according to the manual (Raygor 1970), written at the level of
difficulty of most newspapers, magazines, and novels. The second
article is similar to more difficult study materials found in college
textbooks. Reliability of rate scores for these passages was not given
in the manual. Rate, in words per minute, is measured in each article
by the number of words read during the first three minutes of reading
time. The reader is given a total of five minutes to complete each
article. The first passage is roughly 1,800 words in length; the second,
about 1,600 words.

Each passage is followed by ten multiple-choice questions with four
alternatives. These questions measure only the retention of factual
content. A ten minute time limit is given for answering test questions.
There is no specifically assigned purpose for reading the two articles
involved in the measurement of flexibility. The purpose is established
by describing the general nature of the material in the two passages
and informing the readers that they will be required to answer ten test
questions on each passage which will determine how well they have
understood the material. Readers are told the time limits for reading
and answering questions before they start each part of the test.

Another part of the test is designed to measure skimming and
scanning. Students are told they will have to obtain information
quickly without actually reading all the material. They are given thirty 
test questions in advance which they may attempt to answer by 
referring to various kinds of materials (indexes, charts, bibliographies, 
and textbook excerpts), all of which are printed on blue pages in 
contrast to the white pages in the rest of the text. The reader is told 
that he will have ten minutes to complete all thirty items. 

Unlike all previous tests of reading flexibility, excellent norms were
provided for this test based on a large standardization sample.
Separate norms were obtained for college-bound juniors and seniors in
high school, four-year college freshmen and sophomores, and two-year
college students. Percentile, standard scores, and stanine norms were
provided for rates on both easy and difficult passages, rate differences
(i.e., flexibility), combined retention tests on both passages, and
comprehension scores for skimming and scanning.

In several respects, this test represents a distinct improvement over
other published tests. The normative data are comprehensive and are
based on a large sample of subjects representing a large geographic
area. The skimming and scanning test is an improvement over previous
tests. It utilizes many different kinds of materials and a
large number of test items. In the writer's opinion, this latter test
might better be called a "scanning" test since the questions tend to
direct the readers' search for the answers to factual questions rather
than general ideas. A rather important feature of this test is the
manner in which purpose is induced. Instead of giving specific assigned
purposes, the reader is given a brief description about the nature and
difficulty of each passage involved in the measurement of reading
flexibility, and in the light of this general information he is allowed to
make his own choice regarding rate and approach. The ability to
choose one's own purpose in the light of relevant information is an
important dimension of reading flexibility.

There are, however, some deficiencies in this test which should be
considered. The rate measurements are based on only the first three
minutes of reading. Traxler (1938) found that measuring reading rate
during a short time (i.e., from one to five minutes) resulted in
unreliable measurements. Although the comprehension items for the
flexibility test are limited to factual items, the student is not informed
of this fact in advance. Therefore, many students might be reading for
other more study-type purposes, particularly on the textbook
selection, and might consequently make a low score on factual items.
Strangely, norms were provided only for combined tests for both
passages involved in the measurement of reading flexibility, so the
meaningfulness of the individual rate score from each passage cannot
be interpreted in arriving at the measure of flexibility. The
announcement of time limits in advance of testing for each part of the test
might have differential effects on individuals who have or do not have
watches or groups who have or do not have wall clocks. Of course, the
resulting rate differences for individual readers between the two test
selections used in the measurement of flexibility may be due to a
number of confounded variables such as purpose, article difficulty,
interest appeal, background information, etc. More fundamentally, the
previous discussions of reliability problems of difference scores apply
to this test and cast serious doubt as to the reliability and validity of
these difference scores and the meaning of norms based upon them.
Unless the rate difference scores on this test have a very high
reliability combined with a low correlation between tests, small
differences in rate are not reliable, and norms based on them are
essentially meaningless. According to the manual, the correlation
between the two rate scores on this test is .70. From Table 7.8 in
Thorndike and Hagen (1961) it can be deduced that, if the average
reliability of these two rate scores were .80, the reliability of the rate
difference scores would be .33. In the light of this observation, it
follows that the possible validity of these flexibility measurements is
limited and the norms based upon small rate differences are
meaningless.
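
The deduction from the figures quoted above (a correlation of .70
between the two rate scores and an assumed average reliability of .80)
can be checked with the standard formula for the reliability of a
difference between two scores. The sketch below is offered only as an
arithmetic check. The function name is hypothetical, and the formula
assumes that the two scores have equal variances.

```python
def difference_score_reliability(reliability_a, reliability_b, correlation_ab):
    """Reliability of the difference between two scores.

    Standard formula, assuming equal variances for the two scores:
        (mean of the two reliabilities - correlation) / (1 - correlation)
    """
    if not -1.0 < correlation_ab < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    average_reliability = (reliability_a + reliability_b) / 2
    return (average_reliability - correlation_ab) / (1 - correlation_ab)


# Figures from the text: correlation between the two rate scores of .70,
# and an assumed average reliability of .80 for the two rate scores.
r_difference = difference_score_reliability(0.80, 0.80, 0.70)
print(f"reliability of the rate difference score: {r_difference:.2f}")
```

The same function shows how quickly the reliability of a difference
falls as the correlation between the two rates approaches their
reliabilities, and that it reaches zero when the two are equal.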

The manual (1970) gives twenty Kuder-Richardson reliability 
coefficients for .several parts of the test and the total Reading Test. 
The authors indicated that these figures are spuriously high on the 
skimming and scanning comprehension results because of the speed 
factor. No separate reliability figures were given for comprehension 
scores on the easy and difficult passages involved in the measurement
of flexibility. Even on the combined comprehension tests for these 
two passages, the reliability is only .65. It is understandable, therefore, 
that the individual comprehension test results for these two different 
passages are ignored. No reliability information was given for either 
rate or rate differences in the manual. High content validity is claimed 
for the test. It is also claimed that information about comprehension 
items which could be guessed without reading the passages is used in 
constructing the test. Precisely how this information was used and the 
results of such use are not revealed. 

Correlations were computed between the total score and various 
subtest scores of the Reading Test and the total and subtest scores of 
the Nelson-Denny Reading Test. These results might be of interest 
from a scientific point of view, but the Nelson-Denny Reading Test is 
really not a meaningful criterion for the concurrent validity of the rate 
scores, flexibility scores, or skimming-scanning scores of the Reading
Test. 

On the whole, the Reading Test includes some desirable features as
a measure of reading flexibility not possessed by other tests.
Unfortunately, like the previously described tests, flexibility is
measured by rate differences of unknown and probably very low
reliability. It is strange that not a single one of the test authors who
has attempted to measure flexibility by difference or ratio scores has
been aware of this problem in test construction. If these authors were
familiar with this measurement difficulty, it is not reflected in their
tests or test manuals.



Experimental Measurements

This section is devoted to consideration of a number of different
measurement procedures which provide a variety of operational
definitions of reading flexibility. Some of these procedures are
unpublished tests designed, like published tests, to measure individual
differences in reading flexibility. Other investigations involve the
experimental manipulation of one or more independent variables in
order to study their effects on changes involving various reader
behavior variables which are treated as dependent variables in these
investigations. Although it is not usually the purpose of these studies
to measure individual differences in reading flexibility, they often do
provide information which might be used in the construction of such
tests. Some writers (e.g., Jongsma, 1971) have considered these latter
types of studies as investigations of correlates of reading flexibility.
However, in terms of the prevailing conception of reading flexibility
indicated in both definitions and published measurements as
consisting of certain changes in reader behaviors (dependent variables)
brought about as a function of changes in or between materials or 
within individuals (independent variables), these studies are considered 
as research on variables involved in the measurement of reading 
flexibility. This section is organized in terms of different categories of 
independent variables: material variables and reader variables. Exam- 
ples of studies within each category are given. They use different 
categories of dependent variables or different ways of measuring the 
relationship between these two kinds of variables. Other studies which 
were essentially replications of the same measurement procedures are 
not described. A few pieces of research are considered which, although 
they do not contain measures of reading flexibility as such, still shed 
some light on measurement problems. 
Independent variables: materials. A number of investigations have
related material characteristics, usually difficulty or content, to one
or more aspects of reader behavior. Until the study by Rankin and
Hess (1970) was conducted, all previous studies in this category
measured differences in reader behavior in relation to differences
between articles. These types of studies measured what McCracken
(1965) has called "external flexibility."

Several early studies (Flesch, 1949; Brown, 1952) showed that rate
and comprehension changed in relation to the readability of materials,
but the studies left many variables uncontrolled. Letson (1959)
conducted an important study which compared the relative effects of
material difficulty and reader purpose upon rate and comprehension.
Using college freshmen, Letson had subjects read an easy and a
difficult article with a common assigned purpose. They were to read as
rapidly as possible and still understand the material sufficiently to
answer questions afterwards without looking back. The students also
read two equally difficult articles for two different purposes: 1) to
read as rapidly as possible for the story, and 2) to read for complete
mastery of ideas and details. All passages exceeded 2,500 words and
were said to be comparable with respect to subject matter. All articles
were followed by comparable comprehension tests with respect to
difficulty and other technical criteria. Letson neglected to give the
readability levels of these articles and to describe the comprehension
tests. Also he used a five minute reading time limit which undoubtedly
kept many students from completing the passages. His findings,
although not subject to statistical tests, indicated that differences in
difficulty have considerably more influence on reading rate than
differences in purpose. He even noted that instructions to read for
mastery caused some students to read faster rather than slower. These
findings have implications for future study of reading flexibility, and
his design suggests a suitable way of measuring and comparing the
effects of two different independent variables upon reading
performance. He also made a separate analysis of data for those subjects whose
rates were different than expected in comparing reading performance
on both types of materials and for both purposes. He called this
"negative flexibility." This is a phenomenon that is in need of further
study. His potentially important findings must be considered as
suggestive, since there was no information given about the precise
nature of several important control variables and no statistical tests of
significance were made.
A study similar to Letson's was carried out by Levin (1968) using
bright ninth grade girls and a slightly different comparison of
purposes. This study will not be described, for it contributes nothing
new in the way of suggestions for measurement procedures. However,
Levin's investigation might be consulted, because it is characterized by
a much greater degree of scientific sophistication. Readability figures
were given for all articles, reliability coefficients were given for tests,
and tests of significance were made in drawing conclusions.

It should be noted that comprehension is only an incidental factor 
in the past two studies mentioned. Comprehension was used chiefly to 
induce the proper mental set for readers and to make the rate scores
meaningful. In the Levin study, it could have been used as a measure
of differentiated comprehension purpose attainment, but the same
factual types of questions were asked for all articles. In the next 
section, it is pointed out that there has been some confusion about the 
function of comprehension measurements in reading flexibility 
studies. 

The claim of both Letson and Levin to have constructed comparable
tests of equivalent raw score difficulty to measure the
comprehension both of materials of unequal difficulty and possibly of
purposes of unequal difficulty raises the question about their validity.
Clearly, the difficulty of a test must reflect the difficulty of the reading
passage itself or the intellectual processes involved in reading for
different purposes. A solution to this problem would, of course, be to
construct valid tests for a given purpose or material and to convert
individual scores to standard scores for purposes of comparison. This
has not been done often in the literature on reading flexibility.
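
The conversion to standard scores suggested above can be sketched
briefly. This is a minimal illustration and not a procedure taken from
any of the studies reviewed. The function name and the raw scores are
hypothetical, and the scores are standardized within the group tested
rather than against published norms.

```python
from statistics import mean, pstdev


def to_standard_scores(raw_scores):
    """Convert raw scores to z scores (mean 0, standard deviation 1)
    within the group of readers tested."""
    group_mean = mean(raw_scores)
    group_sd = pstdev(raw_scores)
    if group_sd == 0:
        raise ValueError("scores do not vary, so they cannot be standardized")
    return [(score - group_mean) / group_sd for score in raw_scores]


# Hypothetical raw scores for the same five readers on a ten item test
# over an easy passage and a ten item test over a difficult passage.
easy_raw = [8, 9, 7, 10, 6]
hard_raw = [4, 6, 3, 7, 5]

easy_z = to_standard_scores(easy_raw)
hard_z = to_standard_scores(hard_raw)

for reader, (z_easy, z_hard) in enumerate(zip(easy_z, hard_z), start=1):
    print(f"reader {reader}: easy z = {z_easy:+.2f}, difficult z = {z_hard:+.2f}")
```

Each test is then free to be as hard as its passage or purpose
requires, and a reader's standing on one can still be compared with his
standing on the other.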

Pitcher (1953) conducted an interesting study of the interacting
effects of readability and type of material upon rate. His results were
limited to good college level readers who attained scores of 70 percent
or above on three sets of ten item comprehension questions. This
manner of selecting subjects suggests a technique for measuring the
influence of one or more independent variables upon any dependent
reader variable with comprehension held relatively constant. Pitcher
used three types of material (familiar, abstract, and technical), each of
which included articles written at three levels of readability as
indicated by the Flesch formula. He found highly significant
differences between rates within a given type category for articles written at
different levels of readability. However, passages with equal
readability ratings did not produce equal rates across type-content areas.
This study suggests the importance of controlling type of material
when studying the effects of readability differences upon reading rate
or other dependent variables. Several studies have failed to do this.

A potentially useful design for experimental purposes was used by
Nicholaw (1968) in constructing a test to measure the interacting
effects of readability, subject matter, and purpose upon rate and
comprehension for sixth grade students. Nicholaw used three subject
matter materials: literature, science, and social studies. Each subject
area contained articles written on fourth, sixth, and eighth grade levels
of difficulty. Two purposes were assigned for students to attain in
reading a given passage: understanding main ideas and significant
details. Total tests consisted of eighteen subtests based on passages
400-500 words in length.

No empirical evidence was provided for reliability, validity, or 
comparability of test questions. Nicholaw's test is probably useful only 
for experimental studies of group differences. Some modification of 
this design might be used in future test construction, although the 
time involved in taking such a test might be prohibitive for practical 
purposes. 

Differences in eye movements have been used often as dependent
variables in relation to differences in the difficulty of materials. Early
studies by Judd and Buswell (1922) and Walker (1938) demonstrated
such a relationship, but no statistical tests were used. A more recent
and more sophisticated study carried out by Taylor et al. (1960) is
used as an example of this type of measurement. Taylor used eighth
grade students of average reading ability who had demonstrated the
ability to read with adequate comprehension materials designed for
the Reading Eye Camera. His subjects read different materials while
their eye movements were being recorded. These materials, from the
Reading Eye test file, were written at three grade levels: grades four
through six, junior high, and high school/college. He obtained
significant differences by use of an analysis of variance test between
number of fixations, duration of fixations, rate, and comprehension
for junior high versus high school/college materials. An important
finding of this study, with implications for measurement, was that no
differences in eye movements were obtained with these sixth grade
students between fourth and sixth grade materials or between fourth
and junior high materials. The interpretation given by Taylor was that
the difficulty of content does not significantly affect habits of reading
performance when the material is at or below the reader's grade level.
It should be noted that these findings were based on different content
which may have varied in interest appeal and background knowledge
for different readers. In any case, the problem of the difficulty level of
materials used in measuring reading flexibility needs thorough and
systematic study under carefully controlled conditions.

These previously described studies on the effects of material
difficulty upon various reader behaviors have involved comparisons
between materials. McCracken's concept (1965) of "internal
flexibility" involving changes in approach and rate within a passage was
not used in measurement until recently. Humphrey (1957) had
studied variations in rate from minute to minute within a 7,000 word
article read by college students, but his technique for obtaining such
measurements was not explained. In any case, since he did not relate
these changes in rate to any independent variable, the writer does not
consider this to be a measure of internal reading flexibility. A possible
reason for the delay in the development of a technique for measuring
intra-article changes in reader behavior is found in a study by McDonald
(1960). He determined that both reading rate and comprehension of
college students with high anxiety, as measured by a personality test,
were impaired by testing procedures involving periodic interruptions.
He concluded: "Timing procedures which produce periodic
interruptions during the reading process should be avoided."

In their 1970 study, Rankin and Hess found that periodic
interruptions did not impair the reading of college students of high
anxiety, as measured by the SA-S Senior Scales, when subjects were
given a practice period to help them to adapt to these testing
conditions. Using material in the Diagnostic Reading Test: Survey
Section and the twenty test items in this test, Rankin and Hess had
students read for the purpose of reading as rapidly as possible with
understanding. The subjects consisted of members of reading
improvement classes who were selected for these classes on the basis of scores
falling below the 33rd percentile on the Cooperative English Tests:
Reading Comprehension. Subjects were instructed to underline the
word they were reading when the signal "mark" was given every
fifteen seconds. Previously, the readability of every successive 100
word passage had been determined by a comparable group of subjects
using the cloze procedure on these materials. Rate measurements were
computed by determining the number of fifteen second interval
markings used for each successive 100 word passage. A flexibility
coefficient was computed by correlating the rate scores with the
readability scores for each passage. This procedure produces negative
correlations for students who tend to slow down for more difficult
passages and to speed up for easier passages. In the 1970 study,
Rankin and Hess obtained a correlation for the total group by using,
as a rate measurement, the average rate for the group on each 100
word segment and correlating this distribution of average rates with
the distribution of cloze readability scores. Measurements taken before
and after a one-semester reading course yielded correlations of -.34
before training and -.48 after training. The latter was significant at
the .05 level. A pilot study showed that the average comprehension
score was approximately 75 percent on these materials under these
testing conditions.
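
The computation of a flexibility coefficient of this kind can be
sketched as follows. This is only a minimal illustration and not the
scoring procedure actually used by Rankin and Hess. The function names,
the markings, and the cloze scores are hypothetical, and the linear
interpolation of time between markings is an assumption of the sketch.

```python
from math import sqrt


def time_at_word(word_position, marks):
    """Time, in fifteen second intervals, at which a word was reached.

    marks[i] is the cumulative number of words read when the (i + 1)th
    "mark" signal was given. Time between markings is interpolated
    linearly (an assumption of this sketch).
    """
    if word_position == 0:
        return 0.0
    previous_words, previous_time = 0, 0
    for interval, words in enumerate(marks, start=1):
        if word_position <= words:
            fraction = (word_position - previous_words) / (words - previous_words)
            return previous_time + fraction
        previous_words, previous_time = words, interval
    raise ValueError("word position lies beyond the last marking")


def intervals_per_segment(marks, n_segments, segment_words=100):
    """Number of fifteen second intervals used on each successive segment."""
    bounds = [time_at_word(k * segment_words, marks) for k in range(n_segments + 1)]
    return [later - earlier for earlier, later in zip(bounds, bounds[1:])]


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical reader: cumulative words underlined at each "mark" signal.
marks = [50, 100, 125, 150, 175, 200, 250, 300, 320, 340, 360, 380, 400]
# Hypothetical cloze scores (percent correct) for four 100 word segments;
# a higher cloze score means an easier segment.
cloze_scores = [60, 35, 58, 30]

rates = intervals_per_segment(marks, n_segments=4)
flexibility = pearson_r(rates, cloze_scores)
print("intervals per 100 words:", rates)
print(f"flexibility coefficient: {flexibility:.2f}")
```

Because more intervals per 100 words means slower reading and a higher
cloze score means easier material, this hypothetical reader, who slows
down on the harder segments, obtains a negative coefficient.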

Another study, based on the analysis of individual flexibility
coefficients for each of the subjects used in the previous investigation
both before and after training, was carried out by Rankin
(1970-1971). This study found a wide distribution of flexibility
measurements even among these poor readers. Since several studies
have shown that better readers tend to be more flexible than poorer
readers, Rankin's results call into question McDonald's previously
mentioned observations about the lack of reading flexibility among
readers of all ages.

This recently developed technique for measuring internal
(intra-article) reading flexibility has a number of desirable features. It does
not involve the use of difference scores with their reliability problems.
Flexibility coefficients can be interpreted in the same manner as all
correlation coefficients. Cloze test readability measurements are valid
measurements of readability for the particular population of subjects
being studied while reading these specific passages. All factors
affecting the difficulty of each passage for a particular group of
readers should be reflected in cloze readability measurements. It
should be noted, however, that this recently devised measurement
technique is strictly an experimental procedure. Until techniques are
devised for machine scoring and computer conversion of fifteen
second interval markings to the "number of 15 second intervals used
per 100 words," the time and effort involved to do this work by hand is
prohibitive. At present nothing is known about the reliability of these
measurements or their relationship to "external flexibility." It also
seems likely that the magnitude of the correlations obtained by this
technique would be influenced by the range of both readability and
rate measurements.
Independent variables: reader characteristics. One of the most
important reader characteristics used as an independent variable in
studies of reading flexibility has been purpose. First, studies are
considered which attempt to control purpose by giving specific
directions to the reader. Following these studies, a few methods are
described which give the reader more freedom in choosing his own
purpose.

A study by Walker (1938) found differences in the eye movements
of superior college readers among the following assigned purposes on
passages of equivalent difficulty: to be read for general idea, details,
thorough knowledge, and answering a specific assigned question.
Although no statistical tests of these differences were made, the study
does suggest that assigned purposes may be accepted, at least by superior
(above 90th percentile) college readers. In general, however, the
acceptance of assigned purposes in reading flexibility tests
is open to question unless some evidence is produced which
indicates such acceptance. Previously described investigations by
Letson (1959), Levin (1968), and Nicholaw (1968) have all attempted
to measure the effects of assigned purposes on various reader
behaviors such as rate and comprehension.

Shores (1960) used a different method of studying the effects of
assigned purposes on readers. Using adult level science materials for
both sixth grade students and college students, he used two assigned
purposes: read for the main idea and read to remember ideas in
sequence. Unlike most studies of purpose, Shores attempted to
determine differences in reading approaches through the analysis of
written introspective reports, following each reading, concerning the
manner in which each reader thought he had read the material and how
each student thought an ideal reader would have read the article. The
results of this investigation are not of significance from the standpoint
of this paper, but the method used was important in possibly
influencing later development of reading flexibility tests by Smith
(1961, 1964).

Smith (1961) used the assigned purposes of getting a general
impression and remembering details in reading two different parts of a
biographical selection. She also gave readers a general question based
on the content of the material and a suggested way of reading it.
Twelfth grade students were used in this study. Differential test
questions were used to measure the accomplishment of different
assigned purposes. Smith also used tape recorded interviews of
introspective reports on the methods used for accomplishing each
purpose and the past experiences of readers in reading for different
purposes. In addition, she determined how well each subject held his
assigned purpose in mind by asking each reader to state the purpose
after he had read the assigned material. Although her results are not of
primary interest, it is significant that she found that only approximately
one-half of the poor readers (not defined) could remember the
assigned purpose immediately after reading. This finding suggests that
a question on the retention of purpose might well be used as a part of
tests of reading flexibility using purpose as an independent variable.

Smith's experience in conducting the previous study probably led 
to the development of an informal Test of Purpose and an 
accompanying Reading Inventory for use by high school freshmen 
(Smith, 1964). The Test of Purpose provided an opportunity for 
students to select their own purposes as well as to read for assigned
purposes. Part one of the test consists of twenty-four originally
written selections of seventh-eighth grade readability about a variety 
of topics of interest to ninth grade students and written so as to be 
appropriate to a particular reading purpose. Passages range from 
l()G-360 words in length. Students are asked to read each selection 
quickly and to choose the most appropriate purpose from a list of
purposes. 

Part two of the test consists of twelve selections prepared in the 
same manner as described above and designated to be read for a 
specific assigned purpose. The 12 assigned purposes were arrived at by 
starting with an original list of 215 purposes. Space does not allow a 
listing of these purposes, but in the writer's opinion, they constitute 
the most comprehensive and meaningful choice of purposes used in
any test of reading flexibility. In addition, they are very specifically
stated; e.g., "You are to read this selection for the purpose of
understanding sensory images, or forming vivid images or pictures
from a description (almost being able to see, hear, or touch objects)"
(Smith, 1964). Five multiple-choice questions specifically designed to
measure the accomplishment of the assigned purpose were provided
for each passage. Each passage is timed to obtain a measure of rate. A
checklist of reading approaches is used to obtain student introspections
of procedures used in reading. Finally, the retention of each
assigned purpose is determined following each article. 

Two forms of the total test were constructed. Using seventy-three
high school freshmen, the comparability of both forms was,
presumably, attained by evidence of equal means and standard deviations
on both parts of each test taken by the same students. The writer
questions why correlations were not also used to study comparability.
High reliability coefficients were obtained of about .90 for total test
comprehension scores on both forms. It is strange that no attempt was
made to demonstrate the reliability of the parts. They were probably
low due to the use of only five test items. It is precisely the
comparison of results on different parts of the test which provides the
measure of reading flexibility. A crude attempt to validate the
responses made to questions of procedures used in reading was made
by using eye movement photographs and oral tape recorded retrospections.
These results were, in some unexplained way, compared with
initial test responses. It was claimed, without documentation, that
similar responses in both situations gave evidence that students were in
fact reading for different purposes.
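
The suspicion that five-item part scores are unreliable can be
illustrated with the Spearman-Brown formula. The sketch below is not
part of Smith's analysis. It assumes, purely for illustration, that the
reported total reliability of about .90 refers to the sixty
comprehension items of part two (twelve passages of five items each)
and that the items are roughly parallel; the function and variable
names are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened or shortened.

    length_factor is the ratio of new test length to original length;
    values below 1 describe a shortened test.
    """
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# Reported value: "about .90" for total comprehension scores.
total_reliability = 0.90
# Assumption for illustration: 12 passages x 5 items = 60 items in total.
items_total = 60
items_per_part = 5

# Estimated reliability of a single five-item part score.
part_reliability = spearman_brown(
    total_reliability, items_per_part / items_total
)
print(round(part_reliability, 2))
```

Under these assumptions the estimated reliability of a single
five-item part falls well below the level usually wanted for
interpreting individual scores, which is consistent with the concern
expressed above.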

The Reading Inventory to accompany the Test of Purpose consists of
a checklist of fifty-three statements of reading approaches.
Students checked one of three columns to indicate if they usually,
sometimes, or seldom read in the way described by the statement. A
type of reliability check was made on a tentative draft of the test by
interspersing throughout the inventory items meaning the same thing 
but worded differently. Results were not given. The writer wonders 
why a simple test-retest procedure was not used. In some undefined 
manner, eye movement photographs on the Test of Purpose were used 
to validate the inventory. 

Despite some technical inadequacies and lack of norms and 
probably unreliable difference scores on rate and comprehension, this 
test should be used as a model for future test construction concerned 
with the role of purpose in reading flexibility. Also, unlike other tests 
which are supposed to measure changes in approach in relation to 
purpose, these tests were really designed to study the subtleties 
involved in the concept of reading approaches. The comprehensive and 
highly specific statements of purpose in the Test of Purpose are 
unsurpassed in tests or experimental investigations of reading flexi- 
bility. Due to the specificity of assigned purposes, the use of five test 
questions to measure the accomplishment of one purpose suggests the 
possibility for future testing of using criterion-referenced items which 
may be interpreted without reference to normative data. 

Two interesting observations by Hill (1964) may have significant 
implications for the measurement of purpose-related reading flexibility
and the measurement of reading flexibility in general. Hill
studied changes of rate and comprehension among college students
reading three different articles for three different assigned purposes.
All passages dealt with relatively complex materials on controversial
social problems. He failed to obtain significant differences. Although
he did not fully document this interpretation, he suggested that some
evidence indicated individual interests in reading topics may have
tended to affect the assigned purposes. A need to control the factor of
interest has often been overlooked in the measurement of reading
flexibility. Hill also allowed a smaller group to repeat the test a few
days later. He obtained not only increases in rate and comprehension,
as would be expected, but he also observed a great increase in reader 
interest. He raised the question of why the concept of reading 
flexibility should be restricted to single reading circumstances. Perhaps 
there is a suggestion here for a different measurement procedure 
involving more than one reading. 

An experiment by Grant and Hall (1968) studied the relationship 
between comprehension and the reading achievement level of sixth 
grade students reading for two assigned purposes: 1) to read in order 
to answer a specific, broad, thought provoking question and 2) to read 
in order to answer questions. Important findings with suggestions for
test construction were that, for the best readers who were reading at
their independent level, there was no significant difference in
comprehension; the average readers, reading at their instructional
level, made significantly higher comprehension scores with the help of
the thought provoking question; the poorest readers, reading at their
frustration level, made slightly, although not significantly, lower
scores on the passage without the thought provoking question. These
results, like others, point to the importance of establishing the
appropriate difficulty of material for the readers who will use a test
designed to study the effects of other independent variables such as
purpose. Observations by Henderson (1965), to be discussed, were in 
agreement with respect to the importance of choosing the correct 
difficulty of materials. 

Most studies of the effects of purpose upon reader behavior have 
depended on assigned purposes, with the exception of Smith (1964). 
An experiment by Henderson (1965) investigated individually formu- 
lated purposes for reading among fifth grade students reading very 
easy material at the second grade level. Without discussing the details 
of this rather fully written study, it should be pointed out that a
significant relationship was found between ability to set one's own
purpose and both general reading achievement and ability to attain a
purpose. However, a more important finding from the standpoint of
measurement techniques was that no significant differences in
comprehension were found among the same subjects reading under
three conditions: 1) purpose supplied by the experimenter, 2)
development of own purpose after reading the first half of a story, and
3) no purpose assigned or designated by the student. Thus, it might
follow that students who read well for assigned purposes may also
read equally well when they choose their own purpose. More study,
which might include rate as well as comprehension or other dependent
variables, is needed on this question. It is not clear from reading this
article whether there was any specific relationship between the
assigned purposes and the comprehension test used. Also, the author
noted that the reading material was written at the independent level 
for these subjects. He suggested that the relationship between skill and 
purpose setting and reading comprehension might increase with the 
use of more difficult material. 

A method used by Bloomers and Lindquist (1944) has much to
recommend it as a more adequate technique for allowing the reader to
make a choice of rate and approach in order to attain an assigned
purpose. This technique involves setting a specific purpose for each
passage by providing a content-related question before each passage
and directing readers, ". . . to read at a rate which seemed to them
personally the most efficient for the accomplishment of the purpose
set." This would appear to have some advantage over the usual
admonition to read "as rapidly as possible."

As previously discussed in relation to Raygor's Reading Test, the
techniques used in this test provide some freedom for each individual
to choose his own purpose in the light of a brief description and the
nature and difficulty of material.

More work is needed on the study of different ways of observing
the effects of purpose on reading performance. In the writer's opinion,
if a specific purpose is assigned on a test, some attempt should be
made to ascertain whether or not the purpose was accepted or
remembered. Also, more studies need to be carried out which allow
students to determine their own purposes in testing situations.

Reader familiarity with material is another variable which needs to
be controlled in the study of reading flexibility. So many studies point
to the effect of previous knowledge upon scores of comprehension
tests taken after reading (Irion, 1925; Artley, 1944; Chall, 1947;
Robinson, 1947; Dixon, 1951) that it is not necessary to describe
studies on this obvious fact. More recently Preston (1962) has
demonstrated that students answer questions beyond chance expectations
without reading the passage upon which these tests were based,
even on a standardized reading test.

If the effect of previous knowledge upon comprehension and/or 
rate test results is not controlled, then the differences between test
performance on different passages cannot be interpreted as dependent
variables which change as a function of variation of independent
variables in the measurement of reading flexibility. In some story-type
materials which are new to almost all readers, this factor may be of 
little consequence. Even with these types of materials, some readers 
more than others will have greater familiarity with the topic or style 
of the story. As the content of the test material becomes more closely 
related to specialized areas of experience such as subject matter fields, 
the likelihood of contamination of test results from previous 
knowledge increases. 

With the exception of a test like the Braam-Sheldon Flexibility of 
Reading Test, familiarity of material is more important as a control 
variable than an independent variable in the measurement of reading 
flexibility. Current interest in constructing tests with reading-related
items like the Carver-Darby Chunked Reading Test should speed up 
progress toward the development of reading tests less influenced by 
background knowledge than present tests. 

There has been some speculation about different personality 
variables in relation to reading flexibility. However, people such as
McDonald (1963) and Berg (1967) have been concerned with
correlates of reading flexibility, such as psychological set or emotional
freedom, which are not used as independent variables in the
measurement of reading flexibility. Laycock (1958) studied the
relationship of a personality variable to reading flexibility, but this
study used personality as a correlate and not a variable in the
measurement of reading flexibility. 

Instructional set for reading rate has been used as an independent
variable in reading flexibility measurement. Maxwell (1964) and
Laycock (1955) found that college students could make significant
improvements in reading rates, with adequate comprehension, by
following simple instructions to read materials as fast as possible
without loss of comprehension. These findings challenge the claim by
McDonald et al. (1968) that readers cannot change their reading rate
at will. McDonald's point of view is based on a number of studies
involving subjects of different ages using a variety of tests which
involve a number of confounded variables. On the other hand, Taylor
(1960), in a well controlled study of the ability of eighth grade
students to change reading rate under different instructional sets,
found significant differences in rate but with a significant loss in
comprehension. At present, the evidence about the influence of
instructional set upon reading performance is conflicting. There is a
need for replication of studies concerning the effects of the same
instructional sets upon reading performance of readers of comparable
ability using the same materials.

Bernstein (1955), Taylor (1960), and Bryant and Barry (1961)
conducted studies on the influence of reader interest upon reader
behaviors. Bernstein used two articles of equal difficulty, upon which
significant differences in interest ratings had been determined for
ninth graders, and an extensive comprehension test of thirty questions.
She obtained a significant difference in comprehension in favor
of the more interesting article. Bernstein's passages had been
intentionally written to vary greatly in interest appeal. On the other hand,
Taylor (1960), using Reading Eye materials upon which significant
differences in interest ratings had been found for college subjects,
obtained no significant differences in eye movements, rate, or
comprehension between passages. Similarly, Bryant and Barry (1961)
found no significant differences in rate or comprehension between
two articles read by two different groups of readers whose ratings
indicated a preference for one article over the other. These articles
were not very different in nature, and no indication was given that
there was any extensive difference in interest value for these articles.
In any case, such studies as these do not really measure reading
flexibility. Neither losses nor increases in rate or comprehension based
upon materials of various degrees of interest appeal necessarily
indicate a desirable outcome of reader behavior in relation to changes
in an independent variable. Interest appeal is an important control
variable in the measurement of reading flexibility. There is no doubt
that sufficiently great difference in interest appeal would affect
various kinds of reader behavior.

A few other personality variables have been used as independent 
variables in relation to reader behaviors. Gifford and Marston (1966)
studied the relationship between test anxiety, reading rates, and task
experience in an experimental study involving different groups of
fourth grade boys. Half of the subjects received a practice test while
the other half read the passage only once. Subjects read the passage
for the purpose of getting main ideas or for remembering details.
Differences were found in reading rate in favor of the subjects with
low anxiety under no practice conditions. However, after practice,
these differences disappeared. This significant finding is similar to the
results of Rankin and Hess (1970) about the value of practice on
reading performance of high-anxiety students. It also bears some
relationship to Hill's suggestion (1964) about the value of rereading in
relation to reading flexibility.

Areas of agreement and positive findings

Concepts of reading flexibility 

Although no attempt was made to review all of the definitions which 
have been formulated for "reading flexibility," it can be concluded
that most definitions were in agreement that reading flexibility
reflects the ability of a reader to change some aspects of his reading
skills in order to attain a variety of reading purposes or to read 
different kinds of materials with good comprehension. The purposes 
assumed in most definitions involve different aspects of comprehen- 
sion. Most definitions included some reference to the adjustment of 
rate in an efficient manner suitable to the reading task. 

Various definitions were, therefore, in agreement that reading 
flexibility involves not merely changes in reading processes, but
changes in a desirable direction in order to cope adequately with the 
demands of various reading tasks. It follows logically that measure- 
ments of reading flexibility necessarily involve observations of 
intraindividual changes in behavior in response to various circum- 
stances. 

The differences in definitions point to an important conclusion; 
namely, there is no one entity adequately described as reading 
flexibility. Instead, there are different kinds of reading flexibility. This 
conclusion has important implications for research, theory, and the 
development of measuring instruments. 

McCracken's distinction (1965) between internal flexibility and 
external flexibility was an important one which has influenced
subsequent development of a measurement procedure to measure
internal flexibility.

Techniques of measuring reading flexibility

Published tests. It is much easier to find differences than similarities
among the few published tests of reading flexibility. All four tests
make provisions for the inducement of purposeful reading on all
subparts. With the exception of the Flexibility of Reading Test by
Braam and Sheldon, these tests attempt to have the reader read for
several different comprehension related purposes. The purposes vary
slightly, but they all provide an opportunity for the reader to read for
thorough comprehension and to read for a more general understanding.
Again, with the exception of the Braam and Sheldon test, all tests
contain subtests which purport to measure the reader's ability to skim
for main ideas and to scan in order to locate the answer(s) to one or
more questions presented prior to reading the passage.

All tests, with the exception of the Test of Reading Flexibility by 
Spache and Berg, include materials of different degrees of readability, 
content, interest appeal, and novelty. These procedures reflect an 
omnibus general concept of reading flexibility which involves the
effects of a large number of interacting variables upon reader 
behavior. 

All of the published tests measure differences in reading rates 
obtained on different reading tasks. They also provide measures of
comprehension for each task. Comprehension on these tests is 
measured by use of objective items which require a choice among 
alternatives. Most items are of the multiple-choice variety. Test items 
are used either in order to make it possible to interpret the rate scores 
or to measure the attainment of some purpose. 

Each test provides some type of criteria to enable the reader to 
interpret his rate and comprehension scores. With the exception of the
Test of Reading Flexibility by Spache and Berg, each test also
provides criteria for the interpretation of differences in rates under
various reading task conditions as an indication of reading flexibility.

Each test includes reading materials selected as suitable to the 
reading level and interests of the reader for whom the tests were
written. The passages are of substantial length on all tests.

All comparisons of rate and comprehension scores of the published 
tests of reading flexibility are based on data obtained after reading an
entire passage rather than data obtained in the process of reading a 
passage. Thus, they all measure what McCracken (1965) terms 
external flexibility. 

The Test of Reading Flexibility is the only published test which is
designed to measure the effects of one independent variable (i.e.,
purpose) upon reading rate and comprehension. Other factors which
might affect these reader behaviors are held constant while purpose is
systematically varied. This measurement design is highly commendable
in that it provides results which can be easily interpreted. Another
desirable feature of this test is that it does not attempt to use
difference scores, as such, as criteria for reading flexibility. Instead,
separate norms were provided for rate and comprehension. This is one
means of avoiding the difficulties involved in obtaining suitable
reliability on difference scores based on correlated measurements.

The Flexibility of Reading Test by Braam and Sheldon, although it
possesses a number of technical deficiencies, nevertheless serves as a
model for the measurement of a different kind of flexibility using
subject matter as an independent variable.

The authors of both the Reading Versatility Tests and the Reading
Test made commendable attempts to construct comprehension tests
based on reading related items not subject to being answered by
readers who had not read the passage.

The Reading Test by Raygor contains a desirable innovative feature
in its skimming and scanning subtest. This subtest contains a variety of
different materials such as indexes, charts, bibliographies, and textbook
excerpts. No other published test uses this technique which
closely resembles the types of materials that students would be likely
to use in skimming and scanning in study-type reading. This subtest is
also unique in providing a large number of items to measure the
students' comprehension while skimming and scanning.

Experimental measurements. Most unpublished tests and measurement
procedures used in research studies have attempted to measure
the specific influence of variations in a particular causal agent, such as
reading difficulty or purpose, upon changes in several reading
behaviors. This type of test design has produced test results which can
be interpreted with some precision. When an attempt was made to
study the effects of more than one causal agent, the relative
contribution of each factor to changes in behavior was measured.

A survey of the literature has revealed that experimental tests have 
studied the independent effects of many different causal agents upon 
many different reader behaviors. This is in great contrast to published 
reading tests. Examples of independent variables studied include the 
effects of content, style, familiarity with materials, instructional set 
for rate, interest, a wide variety of reading purposes and different
ways of assigning them, the ability of students to determine their own
purpose, and personality variables. Different reader behaviors studied 
include rate, general comprehension, eye-movements, introspective
reports of awareness of reading processes, retention of purpose,
interest, and comprehension as attainment of a specific purpose.

There was considerable agreement in the literature that differences 
in the difficulty of material influence changes in a wide variety of
reader behaviors (Judd and Buswell, 1922; Walker, 1938; Pitcher,
1953; Letson, 1959; Taylor, 1960; Levin, 1968).

There was general agreement in most studies that different purposes
influence changes in various reader behaviors (Walker, 1938; Shores,
1960 and 1961; Smith, 1964; Grant and Hall, 1968; Levin, 1968).

Several studies have yielded results which emphasize the importance 
of the general level of reading difficulty used in tests of flexibility for
readers at a given reading achievement level. There is some indication
that material written at the reader's independent level is not suitable
for measuring reading flexibility (Taylor, 1960; Henderson, 1965;
Grant and Hall, 1968). 

Results of several studies point to the conclusion that familiarity 
with materials influences reader behaviors (Irion, 1925; Artley, 1944;
Chall, 1947; Robinson, 1947; Dixon, 1951; Preston, 1962).

Gifford and Marston (1966) and Rankin and Hess (1970) have 
found evidence that pretest practice (not just a few simple items as in 
most tests) prevents some test procedures from penalizing readers with 
a high test-measured anxiety level. 

Several findings from individual studies suggest a number of 
important implications for future research and development of 
measurements of reading flexibility. It should be noted that replication
of these studies is needed in order for great confidence to be
placed in them. 

Letson (1959) found that differences in material difficulty pro- 
duced greater differences in reading rate than did differences in 
assigned purposes. He also observed that some readers have negative 
flexibility; that is, a tendency to vary rate in the opposite direction 
than would be expected from differences in material difficulty and 
purpose. 

Pitcher (1953) found that differences in style interact with 
differences in reading difficulty in producing changes in reading rate.

Rankin and Hess (1970) have developed a new procedure for
measuring intra-article reading flexibility that has potential as a
research technique. Rankin (1970-1971), using this procedure,
obtained evidence of greater reading flexibility, even among poor
readers, than previous research on reading flexibility had indicated.

Smith's finding (1961) that only one-half of a group of poor readers
among twelfth grade students could remember the assigned purpose on
passages in a test immediately after reading a passage is of
significance to test constructors. Her 1964 test contained some
excellent techniques for studying reading approaches in relation to
many different assigned purposes. This test provides an excellent
model for the study of purpose.

Hill (1964) found evidence that differences in students' interest in
materials had influence upon the effects of assigned purposes. He also
obtained data that suggest that the concept of reading flexibility not
be limited to materials read for the first time. He found that rereading
a passage produced beneficial effects upon reading performance and 
interest. 

Henderson's study (1965) suggested that readers who read well for
assigned purposes will also read well when choosing their own 
purposes. 

The study by Bloomers and Lindquist (1944) used a different way 
of stating an assigned purpose that has much to recommend it over the 
usual directions "to read as rapidly as possible in order to understand."
The reader was told, after being presented with a content-related
question, to read at a rate which seemed the most efficient for
the successful accomplishment of the purpose. 

The technique used by Raygor (1970) of describing the general 
nature and difficulty of selections and allowing students to choose 
their own purpose in the light of this information, is an innovative 
feature which might be emulated by other test constructors. 

Critique 

Concepts of reading flexibility 

Various concepts of reading flexibility, as revealed in a variety of
definitions, differ mainly with regard to the function of reading rate.
Some authorities (Carrillo and Sheldon, 1952; Braam, 1963) have
defined reading flexibility in such a way as to indicate that changes in
rate per se, in relation to different purposes or materials, will bring
about adequate comprehension. McDonald (1963, 1965, 1967) and
others have defined the concept so as to regard rate changes as the
result of utilizing different reading approaches in relation to various
purposes and material characteristics, so that adequate comprehension
is attained. This dichotomy is confusing to researchers, test constructors,
and teachers. It is quite understandable that different authorities
should give different definitions of a concept. However, if this
particular issue could be resolved through research stemming from
well formulated theory or even logical analysis of known reading
tasks, it would clear up a lot of confusion about the concept of
reading flexibility.

In the writer's opinion, the chief deficiency in virtually all
definitions of reading flexibility is the restriction of the meaning of
purpose to aspects of comprehension. Even Stauffer (1962), who
defined the concept as efficient and satisfactory attainment of
purpose, proceeded to discuss examples which were restricted to types
of comprehension. There are legitimate purposes for reading other
than comprehension which might be included within a broader
concept of reading flexibility.

Techniques of measuring reading flexibility

Many of the deficiencies of published and experimental measurements 
are similar, hence no attempt is made in this section to organize the 
critique into these two subdivisions of published and experimental 
measurements. Instead, both types of measurement are discussed in
relation to a given topic. 

Confounded variables. In many tests of reading flexibility, the test 
is designed so that the differences in reader behavior cannot be 
interpreted as the result of variation in a single variable such as 
purpose or difficulty of material. Instead, changes in behavior as 
measured by the tests may be due to an unknown combination of 
factors such as interest in materials, style of materials, content of 
materials, difficulty of materials, familiarity with materials, and 
purpose. This lack of control makes flexibility scores difficult to
interpret. Since two identical scores may have different meanings for 
two individuals, the test results are of little diagnostic value for 
teachers. The confounding of variables is more characteristic of 
published tests than of experimental measurements which often 
measure the independent and relative influence of two or more 
variables. 

Difference and ratio scores. In most measurements, reading flexibility
is measured by differences in a reader's behavior (or ratios based
upon differences) under two or more circumstances. This measurement
procedure is apt to result in difference scores with low reliability
unless the two measures are not highly correlated and each of the two
measurements has a high reliability coefficient. This problem is more
acute for tests which were designed to measure individual differences
than for experimental tests designed to measure group differences.
Even in the latter case, difference scores with low reliabilities may
result in findings of nonsignificant differences between group means.
The writer has not found a single published test or experimental test
which has given any evidence of sufficiently reliable difference scores.

The use of difference scores (or ratio scores) is also questionable on
the grounds that a person whose lowest score is near the top of the
distribution may not have the same opportunity to display a large
difference score as a person whose lowest score is near the bottom of
the distribution. Whether there is such a measurement-induced
interaction is not known.

Technical characteristics. Manuals for published tests and research 
studies using experimental measurements have often provided either 
little information or questionable information about such things as 
reliability, validity, norms, comparability of subtest results, or 
comparability of all test forms. 

Reliability coefficients are often missing (or lacking) for either rate 
or comprehension measurements in reading flexibility tests. Although 
it is not the writer's intention to carelessly or unjustly level 
accusations, several practices seem to falsely create the illusion of 
satisfactory reliability. For example, some published tests have 
combined two subtests in computing reliability. Some tests have
inflated reliability coefficients by using Kuder-Richardson reliability
coefficients for speeded tests. Some tests are accompanied by
reliability figures for the total test but not for the subtests, which are
crucial to the measurement of flexibility. One published test manual
presents reliability coefficients for several tests, but not for all test 
forms in the battery. One published test is used despite a study which 
found that the reliabilities of its subtests were inadequate for both 
rate and comprehension. 

Test validity is often ignored altogether. It is interesting that the
writer has found not one study of concurrent validity using two
different tests of reading flexibility. Often undocumented claims were
made for content validity. One test used another published reading
test as a validity criterion for a flexibility of skimming and scanning
subtest, even though the other test contained no measures of either
set of skills. The test manuals and/or published studies about some
tests contained some vague references to eye-movements as validity
criteria, but presented no empirical evidence to show the specific
results of this procedure.

Criteria for interpretation of rate, comprehension, or rate difference 
scores have been given often without any indication of how these 
criteria were obtained. Some criteria were apparently arbitrary. In
some instances, norms were presented without any description of the 
normative sample. 

Several tests have used comprehension subtests constructed to yield 
equal scores even though the tests were measures of performance on 
tasks of unequal difficulty. This practice raises serious questions about 
their validity. In general, then, most reading flexibility tests either do
not meet the technical criteria for good test construction, or their 
constructors simply have provided no information about such matters. 

Comprehension measurement. The confusion about the different 
functions of comprehension tests in the measurement of reading 
flexibility is discussed in relation to the writer's model of reading 
flexibility. However, it should be noted that comprehension tests were
often used which are not adequate measures of the attainment of an 
assigned purpose. Another common fault is the construction of tests 
which are too short to have adequate reliability. The relationship of 
the comprehension measurement and the reader's rate is not clear in 
many tests. As Farr (1969) has indicated, the practice of forming a
rate-of-comprehension index by multiplying rate times comprehension 
percentage is not justified. Some general comprehension tests allow 
the reader to refer to the article which answers questions, while others 
demand recall of information. Only a few test constructors have 
attempted to use reading related items which would be free from the 
influence of previous information on their comprehension tests. In 
general, there has been a tendency to construct comprehension tests 
which emphasize the measurement of lower level aspects of com- 
prehension. 

Rate measurement. Test constructors have used a variety of
techniques in measuring rate. Some tests have used time limits in
measuring rate while other tests have been untimed. Some tests have
instructed the reader to read as rapidly as possible, while others have
instructed the reader to read at his normal rate or at a rate suitable to
attaining the assigned purpose. One test measures rate based upon
words read during the first three minutes, while most others allow a
longer time. Some rate measurements were based upon the reading
time before answering questions, while others included the time taken
in answering questions in computing rate. Such a variety of measurement
practices calls into question the comparability of different
reading flexibility test results.

Purpose. Statements of purpose have, in most measurements, been
limited to general types of comprehension components like remembering
details or getting the main idea. Few studies have attempted to
determine the extent to which assigned purposes are accepted.
Purpose statements are many times vague and general and give the
reader little help in adjusting reading approaches to purpose attainment.
At present, it is not known whether it is best to assign specific
purposes or to allow readers to choose their own purposes in the light
of some information about the material. Some assigned purposes like
"read to understand the article completely" probably serve to confuse
the reader. Some statements of purpose include suggestions for
attaining it, while others do not do this. Some assigned purposes are
related to the content of the passages, while others bear no
relationship to content. Such a variety of ways to assign purposes
makes it difficult to compare the results of different reading flexibility
tests. Indeed, interpretations of single tests are useless without careful
definition of assigned purposes.

Readability of materials. The difficulty of materials used in a test of 
reading flexibility has a crucial bearing upon the results. Many 
published tests are not accompanied by sufficient information in the 
manual about this important factor. In contrast, most informal 
measurements do give the reader this information. The precise effects 
upon the measurement of reading flexibility of the readability level of 
materials in relation to the reading achievement level of the reader are 
not known. Even the effects of differences in readability between 
passages upon flexibility measurements are not known. The failure of 
some tests to find evidence of much reading flexibility in the 
population may be due to small differences in readability between 
different passages in the test. 

External flexibility. Virtually all reading flexibility measurements 
of the effects of material difficulty upon reader behaviors have been 
limited to comparisons of results obtained on two or more complete 
passages. Very little is known about changes in adjustment of reading
approaches while a person is reading a single extended passage
(internal flexibility).

Negative flexibility. The tendency of some people to slow down for 
easy passages or purposes and to speed up for more difficult passages
or purposes has been observed in only one study. Little is known 
about this phenomenon. 

Effects of instructional set. Conflicting evidence in the literature 
and competing claims of various authorities about the effects of 
instructions to read faster leave this issue undecided.

Effects of interest. Again, conflicting evidence on the effects of 
interest on reader behavior makes for uncertainty in test construction. 
It is not known how much difference in interest appeal two articles 
must have in order to produce a significant effect upon the reader's 
performance. 

Selection of dependent variables. A wide variety of reader behaviors 
have been used in the measurement of reading flexibility. Such usage 
is related to the different concepts of flexibility previously discussed. 
It is not known, however, if a causal agent which produces a change in 
one reader response will produce similar changes in other reader 
responses. 

Generalization of findings. There are many different ways of
measuring reading flexibility which involve different pairings of
independent and dependent variables, and which are based upon tests
of unknown validity. These tests are taken by readers of different ages
and reading levels, and are rarely replicated. Most generalizations
about reading flexibility must, therefore, be interpreted with caution.

The purpose of this concluding section is to make recommendations
for future research and development on the measurement of reading
flexibility, based upon the findings of this paper and a proposed
reading model, which should result in clearer conceptions of the
nature of reading flexibility(ies) and more valid and useful measurement
instruments. First, a model is presented which is designed to
clarify conceptualizations of different kinds of reading flexibility
designated by the choice of the independent variable(s). Then designs
are suggested for future research and development involving different
control variables and dependent variables. After a discussion of the
implications of the proposed model, a number of recommendations
which stem from the review of literature are made for improvement of
the measurement process.

Model of reading flexibility 

This model presents the general conception that reading flexibility
involves an individual reader's ability to make desirable adjustments in
reading approaches in order to enable him to attain one or more
legitimate reading purposes under different conditions. These conditions
are related to differences in materials, differences in his own
psychological state, or differences in the external environment. It is
assumed that the unobservable changes in approach can only be
measured by studying observable changes in the reader's behavior in
relation to changes in one or more independent variables. The term
"legitimate reading purpose" designates a purpose which can reasonably
be expected to be attained through reading.

Implications of the model 

Given these considerations, it follows that only intraindividual changes
in behavior constitute valid dependent variables for the measurement
of reading flexibility per se. Of course, all changes in reading
approaches take place within the individual, and may, under suitable
circumstances, result in behavioral changes. Independent variables
include manipulated changes either between or within materials,
within the reader himself, or between different environmental
circumstances. These measured changes in behaviors may be observed,
in the classical design, when only one independent variable is changed
and all other relevant variables are held constant. Any given variable in
the independent variable column may be, for one purpose, systematically
changed as a causal agent or, for another purpose, serve as a
control variable which is held constant. In multivariate designs, the
effects of changes in several independent variables upon changes in
reader behaviors can be studied with suitable statistical techniques.
These observations, which have traditionally been made with reference
to "experimental designs," have important implications for the
measurement of reading flexibility. This is true even though precise
control is lacking in the form of a test.

Table 1. GENERALIZED MODEL FOR READING FLEXIBILITY

Independent (or Control)    Approach                   Reader Behavior
Variables                   Variables                  Variables
--------------------------  -------------------------  --------------------------
Materials                   Perceptual, Cognitive,     Rate, Comprehension,
(Inter- or Intra-Passage)   or Affective Processes     Retention, Evaluation,
                            Involved in Reading        Application, Introspective
Readers                     Purpose Attainment         Reports of Appreciation,
(Intra-Individual)          (Intra-Individual)         Approach, or Purpose
                                                       (Intra-Individual)
External Environment
(Inter-Situation)

It is suggested that a more precise conception of reading flexibility
would be attained if one or more verbal labels were added to the term
"reading flexibility" based upon the specific independent variable(s)
used in its measurement. For example, one might measure material
difficulty flexibility, purpose flexibility, noise level flexibility, or
material difficulty plus purpose flexibility. When two or more
independent variables are used, the measuring process should be
designed to compare their differential effects upon reading behaviors.
More precise designations based upon well-designed measurement
procedures would improve conceptualizations of reading flexibility.
Greater precision of measurement and communication about test
results and research findings would result. Improvements should be
circular and cumulative among concepts, measurement, design, and
communication.

Examples of material variables might include readability, style,
subject content, and typographical features. The writer would include
under reader variables several features which have not been considered
or measured as a type of reading flexibility. The concept of purpose
might be expanded beyond conventionally measured comprehension
purposes (understanding main ideas, remembering details) to include
any legitimate purpose for reading, the outcomes of which could be
measured. For example, reading for appreciation, reading for
application, reading for memorizing a role in a play, reading for long
range retention, and reading for evaluation, could all be used
appropriately. Of course, as a measure of flexibility, the effects of two
or more purposes would have to be compared. Previous knowledge about
the content of a passage should, in most measurements, be held constant
through passage selection as a controlled variable. Otherwise, the
effects of differential reader knowledge should be taken into account
in the measurement process. It should be possible, however, to vary
this factor to determine the extent to which a reader can adjust his
approach. It should be done in such a way as to minimize
comprehension loss under conditions when background experience is
lacking. Other reader variables, some of which might be used as either
independent or control variables, are interest in topic, instructional
set, fatigue, and mood. The last major category in the independent
variable column is labeled "external environment." This, also, is
usually supposed to be held constant while measuring other types of
reading flexibility. However, the model would indicate the possibility
of extending the concept of reading flexibility to include measuring
changes in reader behavior under various environmental conditions
such as noise level, presence of other people, and visual distractions. In
summary, this model suggests both a more precise operational
definition of reading flexibility and an expansion of the concept to
include a variety of independent variables not usually included.

A complete understanding of the many perceptual, cognitive, and
affective components of reading approaches is lacking at present.
These processes would include any and all internal processes which
take place in response to a given reading task as a means of attaining
the purpose of that task. Such processes as perceptual discriminations,
perceptual closure, synthesis, analysis, induction, drawing conclusions,
comparisons, study skills, word attack skills, emotional responses,
and motivational state would be involved. Only a relatively
general idea of the variables involved in reading processes can, at
present, be attained from behavioral evidence. The use of criterion
referenced test items should shed light on this complex problem area.

Either one or several reader behavior variables might be used as 
empirical criteria for measuring the outcomes of changes in one or 
more independent variables. These might include such measured 
outcomes as rate, comprehension, eye-movements, long range reten- 
tion, evaluation, application, introspective reports on appreciation, 
awareness of reading processes, or acceptance and/or retention of 
purpose, or ability to formulate an appropriate purpose. Generally, 
any measurable behavior may be used as a dependent variable 
provided that it is appropriate to a given concept of reading flexibility
and is sensitive to changes in the independent variable(s) being used in
the measuring process.

A note should be added about the role of comprehension as a
product in flexibility measurement. One use of this variable is as a
measure of purpose attainment. For example, if different assigned
purposes included recall of details, interpreting author's intent, or
drawing conclusions from evidence, then appropriate criterion
referenced comprehension questions should be used to measure the
reader's success in accomplishment of purpose. In other studies,
comprehension questions might serve merely as a means of ascertaining
the validity of rate scores. As such, these test results would not be,
in effect, actual dependent variables as the term is used in the model.
Still another use of comprehension test questions is to elicit in the
reader a serious "mental set" which will increase the probability that
he will cooperate in the measurement proceedings. Again, such a use
of comprehension test results would not constitute a dependent
variable in this model. The concept of reading flexibility stemming
from the model is essentially positive in that it views reading
flexibility as a desirable outcome of learning. Therefore, comprehension
consistency (at a level suitable to purpose) rather than variation
in comprehension, despite changes in independent variables, could be
an indication of reading flexibility.

This model suggests a general concept of reading flexibility in terms 
of a research design model. The model specifies the necessity for 
greater precision in both conceptualization and measurement pro- 
cedures. It suggests an expansion of the concept which should 
stimulate new and different kinds of research and development of 
measurement procedures. As a research design model, it points to the
kinds of control variables which should be utilized in measuring 
reading flexibility in the future. 



Other recommendations 

This concluding section is based on evidence from both positive 
findings and critical evaluations of the literature on the measurement 
of reading flexibility. It is divided into two categories: 1) suggested 
procedures for improving measurements and 2) suggested research on 
measurement. 

Suggested procedures for improving measurements 

Confounded variables. Tests of flexibility should be constructed so 
that the independent effects of separate independent variables upon 
reader behaviors can be determined. Tests constructed by Pitcher 
(1953), Letson (1959), and Levin (1968) could serve as basic models 
for such tests. One passage read for different purposes, as in the Test 
of Reading Flexibility by Spache and Berg, could serve as another 
model suitable to the attainment of this purpose. Results of these
kinds of tests would be of diagnostic value to teachers, because the 
teacher would know what factor was responsible for a student's lack 
of reading flexibility. 

Difference and ratio scores. Difference scores as measures of
flexibility should not be used unless the difference between the two
scores exceeds the standard error of measurement of a difference for
the two tests. A formula for the standard error of measurement of the
difference, denoted S_{mDiff(1-2)}, is

    S_{mDiff(1-2)} = \sqrt{S_{m1}^2 + S_{m2}^2}

where S_{m1} is the standard error of measurement for one test and
S_{m2} the standard error of measurement for the other (Thorndike
and Hagen, 1961).
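
As a minimal sketch of this rule, assuming the two standard errors of measurement are already known (for example, from test manuals), the following Python fragment computes S_mDiff(1-2) = sqrt(S_m1^2 + S_m2^2) and checks an observed difference against it. The function names and the sample values are illustrative assumptions, not figures from any published test.

```python
import math


def sem_of_difference(sem_1: float, sem_2: float) -> float:
    """Standard error of measurement of a difference between two scores.

    Implements S_mDiff(1-2) = sqrt(S_m1^2 + S_m2^2).
    """
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)


def difference_is_interpretable(score_1: float, score_2: float,
                                sem_1: float, sem_2: float) -> bool:
    """True when the absolute difference exceeds its standard error of measurement."""
    return abs(score_1 - score_2) > sem_of_difference(sem_1, sem_2)


# Hypothetical reading rates (words per minute) on two passages, with
# hypothetical standard errors of measurement of 30 and 40 words per minute.
print(sem_of_difference(30.0, 40.0))                           # 50.0
print(difference_is_interpretable(310.0, 250.0, 30.0, 40.0))   # True  (60 > 50)
print(difference_is_interpretable(280.0, 250.0, 30.0, 40.0))   # False (30 < 50)
```

Under these assumed values, a 30-word-per-minute difference between passages would not count as evidence of flexibility, since it falls within the error of measurement.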

Future tests of reading flexibility should indicate evidence of the
reliability of any difference scores used as measures of flexibility. The
formula for this is

    r_{diff} = \frac{\frac{1}{2}(r_{11} + r_{22}) - r_{12}}{1 - r_{12}}

where r_{11} is the reliability of one measure, r_{22} the reliability of
the other measure, and r_{12} is the correlation between the two
measures (Thorndike and Hagen, 1961). From this formula, it can easily
be seen how the correlation between measures reduces the reliability of
the difference scores.
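
To make the effect of the intercorrelation concrete, here is a short Python sketch of the formula r_diff = ((r_11 + r_22)/2 - r_12) / (1 - r_12). The function name and the reliability values are hypothetical and chosen only for illustration.

```python
def reliability_of_difference(r_11: float, r_22: float, r_12: float) -> float:
    """Reliability of a difference score.

    r_11, r_22: reliabilities of the two measures.
    r_12: correlation between the two measures (must be below 1).
    """
    if r_12 >= 1.0:
        raise ValueError("r_12 must be less than 1")
    return ((r_11 + r_22) / 2.0 - r_12) / (1.0 - r_12)


# Two hypothetical measures, each with reliability 0.90, at increasing
# levels of intercorrelation.
for r_12 in (0.0, 0.5, 0.8):
    print(r_12, round(reliability_of_difference(0.90, 0.90, r_12), 2))
# 0.0 0.9
# 0.5 0.8
# 0.8 0.5
```

With these assumed values, two highly reliable measures that correlate .80 with each other yield a difference score with a reliability of only .50.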

Although it may not be apparent, the previously stated formula for
the standard error of measurement of the difference does reflect the
influence of the reliability of the difference, denoted r_{diff}, since

    S_{mDiff(1-2)} = S_{Diff(1-2)} \sqrt{1 - r_{diff}}.*

Proof of this was provided by
Dr. Frederick B. Davis of the University of Pennsylvania in personal
correspondence.
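
The proof from that correspondence is not reproduced here. One way to see the relationship, offered only as a sketch, rests on two added assumptions: that both measures have the same standard deviation S (the condition under which the simplified formula for r_diff applies), and that each standard error of measurement follows the usual definition S_m = S sqrt(1 - r).

```latex
\begin{align*}
S_{Diff(1-2)}^2 &= S^2 + S^2 - 2 r_{12} S^2 = 2 S^2 (1 - r_{12}) \\
1 - r_{diff} &= 1 - \frac{\tfrac{1}{2}(r_{11} + r_{22}) - r_{12}}{1 - r_{12}}
              = \frac{1 - \tfrac{1}{2}(r_{11} + r_{22})}{1 - r_{12}} \\
S_{Diff(1-2)}^2 \, (1 - r_{diff}) &= 2 S^2 \left[ 1 - \tfrac{1}{2}(r_{11} + r_{22}) \right]
              = S^2 (1 - r_{11}) + S^2 (1 - r_{22}) \\
             &= S_{m1}^2 + S_{m2}^2 = S_{mDiff(1-2)}^2
\end{align*}
```

Taking the square root of both sides gives S_mDiff(1-2) = S_Diff(1-2) sqrt(1 - r_diff).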

The problems of interpreting ratio scores and measuring their 
reliability are so complex (Cronbach, 1941) that it might be best to 
substitute difference scores for ratios based upon differences. 

The use of correlation coefficients as measures of flexibility as used 
by Rankin and Hess (1970) is one way to avoid the problems of 
difference scores. Another way of measuring flexibility without using
difference scores is to provide separate norms for rate and comprehension
(or other variables) as was done by the Test of Reading
Flexibility by Spache and Berg. 

Technical characteristics. Appropriate reliability coefficients should 
be provided for all measurements used in testing reading flexibility. 
Also, steps should be taken to improve reliability. The following 
suggestions by Jongsma (1971) should be helpful in attaining this
objective: provide clear instructions, control external conditions, use
long reading passages, use longer tests, and follow the technical criteria
for test construction.

The validity of a reading flexibility test is not to be taken for
granted. Evidence of validity such as content validity, concurrent
validity, predictive validity, or construct validity should be provided
for every measurement of reading flexibility.

*The standard error of the difference, denoted S_{Diff(1-2)}, reflects the amount of
difference between two scores attributed to differences in both "true score" and "error
score." The standard error of the measurement of a difference, denoted S_{mDiff(1-2)}, reflects
the amount of difference between scores attributed to differences in "error score."

Comparability of tests of comprehension based upon tasks of
unequal difficulty should be accomplished not by trying to construct
tests of equal difficulty, but by converting raw test scores to derived
scores such as standard scores. Thus, test results could be compared
even though the tests are not equal in difficulty.
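
The conversion to derived scores can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the five raw scores per passage are invented, the norm group is simply the sample itself, and the scale with mean 50 and standard deviation 10 is one common choice of standard score among several.

```python
from statistics import mean, pstdev


def to_standard_scores(raw_scores, new_mean=50.0, new_sd=10.0):
    """Convert raw scores to standard scores relative to their own group."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    if sd == 0:
        raise ValueError("all raw scores are identical")
    return [new_mean + new_sd * (x - m) / sd for x in raw_scores]


# Hypothetical raw comprehension scores for the same five readers on an
# easy passage and on a hard passage.
easy = [16, 17, 18, 19, 20]
hard = [4, 7, 10, 13, 16]

print([round(s, 1) for s in to_standard_scores(easy)])
# [35.9, 42.9, 50.0, 57.1, 64.1]
print([round(s, 1) for s in to_standard_scores(hard)])
# [35.9, 42.9, 50.0, 57.1, 64.1]
```

In this invented case each reader holds the same relative standing on both passages, so the standard scores agree even though the raw scores on the hard passage are much lower.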

Adequate norms should be provided for all published reading
flexibility tests with respect to rate, comprehension, flexibility
measurement, or any other measurements used. Norms for difference
scores might start with minimal differences which exceed their
standard error and then provide percentile, or some other norms, for
differences greater than this minimum.

Comprehension measurement. Constructors of reading flexibility
tests should be familiar with the various functions of such tests in the 
measurement of reading flexibility. (See discussion in relation to the 
writer's Reading Flexibility Model.) Comprehension tests must suit the 
purposes for reading each passage. The use of criterion referenced 
items in certain types of flexibility measurements would be desirable 
and would remove the necessity of using norms for interpretation. 
Efforts should be made to construct tests with reading related items 
that could not be guessed by persons not reading the passages. Tests of 
comprehension should reflect many different aspects of comprehen- 
sion, provided that this kind of test is suitable to the reader's purpose. 
Comprehension is best measured in power tests without time pressure. 

Rate measurement. Rate measurements in reading flexibility tests
should be based on a sufficiently long period of time to obtain a
reliable measurement. With the exception of scanning tests, it is best
not to confound rate and comprehension test measures by including
the time taken to answer questions in the timing procedure. The
directions given to the student related to rate of reading need to be
appropriate to the purposes for reading. The validity of rate
measurements must always be determined by the reader's
comprehension.

Purpose. If purposes are assigned on a test, it would be wise to
determine each student's acceptance of the assigned purpose through
some type of introspective report. Statements of purpose must be
precise and not subject to differing interpretations. The test by Smith
(1964) should be used as a model for its excellent statement and
selection of reading purposes. If the stated purposes do not help
readers to make decisions about how to adjust their reading
procedures in order to accomplish the assigned purpose, they are of no
value. In fact, they reduce the validity of the test. Also, the content of
the passage should be suitable to its assigned purpose.

Readability of materials. Tentative research findings suggest the
avoidance of easy materials written at the reader's independent level.
It would probably be advisable to select reasonably large differences in
readability of materials. Otherwise many of the resulting difference
scores will not be reliable. The writer recommends the cloze procedure
as an excellent measure of readability. Unlike readability formulae, it
measures the readability of the passage for a particular group of
readers.
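
For readers unfamiliar with the cloze procedure, the following Python sketch shows its general mechanics. It assumes the common convention of deleting every fifth word and scoring exact replacements; neither detail is specified in this paper, the sample passage is invented, and the handling of punctuation is deliberately simplified.

```python
def make_cloze(text: str, n: int = 5, blank: str = "_____"):
    """Replace every n-th word with a blank.

    Returns the mutilated text and the list of deleted words.
    Words are split on whitespace only; punctuation is not handled.
    """
    words = text.split()
    deleted = []
    for i in range(n - 1, len(words), n):
        deleted.append(words[i])
        words[i] = blank
    return " ".join(words), deleted


def cloze_score(responses, deleted) -> float:
    """Percentage of blanks restored with the exact original word (case-insensitive)."""
    if not deleted:
        raise ValueError("no deletions to score")
    hits = sum(r.strip().lower() == d.lower() for r, d in zip(responses, deleted))
    return 100.0 * hits / len(deleted)


passage = ("The reader adjusts his rate to the purpose and to the "
           "difficulty of the material")
cloze_text, answers = make_cloze(passage)
print(cloze_text)
# The reader adjusts his _____ to the purpose and _____ the difficulty of the _____
print(answers)
# ['rate', 'to', 'material']
print(round(cloze_score(["rate", "to", "text"], answers), 1))
# 66.7
```

Averaging such percentages over the members of a particular group gives a readability estimate for that group on that passage, which is the property the recommendation above relies on.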

Effects of interest. It is important to use materials of comparable 
interest appeal in most measurements of reading flexibility. This
factor can be measured with rating scales or with the use of the 
semantic differential technique. 

Additional suggestions. Future tests of scanning might use the 
skimming and scanning subtest of the Reading Test by Raygor as a
model of good test construction. Finally, a valid test of flexibility 
using different subject materials similar to the Flexibility of Reading 
Test by Braam and Sheldon should be constructed. Such a test would
be very helpful for teachers. 

Suggested research on measurement 

Confounded variables. There is a need for replicated, well controlled 
studies on the specific effects of various independent variables upon 
different dependent variables involved in reading flexibility. Such
studies should use multivariate designs to determine both the
independent effects of each variable and also their interactions. 

Difference and ratio scores. Research is needed on ways to improve
the reliability of difference scores used on correlated measurements. 
Of course, the improvement of the reliability of each individual 
measurement would help. Also, the increase in differences in task 
difficulty to create greater differences in test performance might 
produce more difference scores which would exceed the magnitude of 
their standard error of measurement. 

Technical characteristics. Research is needed on appropriate techniques
for measuring the validity of reading flexibility tests. Little is
known about this important problem and very little has been done in
determining the validity of flexibility tests, other than claims to the
establishment of content validity, some comparisons of test results
with eye-movement data, some correlations of comprehension tests
with other comprehension tests, and comparison of test results with
introspective reports. Techniques for measuring concurrent, predictive,
and construct validity should be investigated.

Comprehension measurement. More research is needed which 
compares the practice of measuring comprehension by recall questions 
with the technique of allowing students to refer to the selection in 
answering questions. Study is also needed on techniques for construct- 
ing tests with reading dependent items. More knowledge should be 
obtained about the relationship of criterion referenced test items and 
classical test theory. 

Rate measurement. More knowledge is needed on the formulation
of instructions to students which influence rate of reading. What are
the relative effects upon rate and comprehension of different
instructions such as "read as rapidly as possible in order to answer
questions," "read at your normal speed," "read at a rate which seems
most efficient for attaining this purpose"?

Another matter in need of investigation is the extent to which 
changes in rate reflect changes in reading approaches suitable to 
different purposes. It is possible that, under certain circumstances, a 
change in rate might simply reflect an increase in speed of a particular
reading process and not a change in approach. As Nacke (1971) has 
noted, we need to investigate precise ways of relating rate to 
comprehension. At present, the procedures for interpreting this 
relationship are vague. 

Purpose. Study is needed on the relative effects of content related 
assigned purposes versus noncontent related assigned purposes. Re- 
search is also needed on the relative effects of including suggestions 
for purpose attainment together with a statement of purpose, as was 
done by Smith (1961), as opposed to using the statement of purpose 
without any such suggestions. More work is needed on the question of 
whether purposes should be assigned at all. Perhaps the reader should 
be free to determine his own purpose if given sufficient information 
upon which to make a decision. The latter condition constitutes the 
truest measure of reading flexibility under most realistic reading 
circumstances. 

Readability of materials. More research is needed on selecting the 
readability of materials suitable for valid and reliable measures of 
different types of reading flexibility by various groups of readers. 
Also, criteria for the selection of differences in the difficulty of 
materials for particular groups of readers need to be determined. 

Internal flexibility. More investigation of the nature of internal 
(intra-article) flexibility and ways of measuring it is badly needed. A 
technological solution to the problem of machine scoring tests of 
internal flexibility as used by Rankin and Hess (1970) could be useful. 
Also, the relationship between internal and external flexibility is a 
problem worthy of investigation. 

Negative flexibility. Only one investigator known to the writer has 
studied negative flexibility. More work is needed on ways of 
measuring this phenomenon and determining its significance. 

Effects of instructional set. Conflicting evidence in the literature on 
this topic suggests the need for further research. This is a topic of 
practical significance to classroom teachers. 

Effects of interest. Conflicting evidence on the effects of interest on 
reading performance, some of which is based on questionable research 
designs, suggests the need for more sophisticated study of this factor 
in relation to different age groups or students at various reading levels. 

Selection of dependent variables. A vital issue to be determined by 
investigation is, "What kinds of reader behaviors are most suitable for 
the measurement of the effects of a given independent variable?" This 
question deserves systematic study. It is possible that the choice of 
one dependent variable might result in variations with a given 
independent variable while the choice of another dependent variable 
might not indicate any relationship at all. 

Other problems in need of solution. Findings in this review suggest 
several other topics for future research: What are the effects of 
rereading a passage upon the measurement of reading flexibility? What 
are the effects of a practice test upon the measurement of reading 
flexibility? 

Generalization of findings. With all of the limitations of current 
published tests and experimental procedures for measuring reading 
flexibility, generalizations are difficult to make. A program of 
systematic research and development along the lines suggested by the 
writer's model for reading flexibility, and the suggestions based upon 
this review of the literature, would greatly enhance the development 
of valid measurements of reading flexibility. With an organized 
approach based on the systematic investigation of hypotheses 
suggested by the model, research gaps could be eliminated and 
information useful to teachers could be obtained. 